I have a dummy dataframe with four columns.
df <- data.frame(
  City      = c("A", "A", "A", "B", "B", "B", "B"),
  Name      = c("Jon", "Bill", "Bill", "Maria", "Ben", "Tina", "Tina"),
  Age       = c(23, 41, 32, 58, 26, 12, 15),
  Eye_color = c("Blue", "Blue", "Brown", "Brown", "Blue", "Blue", "Brown")
)
  City  Name Age Eye_color
1    A   Jon  23      Blue
2    A  Bill  41      Blue
3    A  Bill  32     Brown
4    B Maria  58     Brown
5    B   Ben  26      Blue
6    B  Tina  12      Blue
7    B  Tina  15     Brown
I want to remove the duplicated Names (Bill and Tina) in two different ways:
First case: group by City and, where a Name is duplicated, keep only the Blue-eyed row. Result 1 should look like this:
  City  Name Age Eye_color
1    A   Jon  23      Blue
2    A  Bill  41      Blue
3    B Maria  58     Brown
4    B   Ben  26      Blue
5    B  Tina  12      Blue
Second case: if the City is A, keep the Blue-eyed row among the duplicated Names; if the City is B, keep the Brown-eyed row among the duplicated Names.
Result 2 should look like this:
  City  Name Age Eye_color
1    A   Jon  23      Blue
2    A  Bill  41      Blue
3    B Maria  58     Brown
4    B   Ben  26      Blue
5    B  Tina  15     Brown
Thanks for the help!
CodePudding user response:
Here is one possibility using filter() from dplyr:
First we filter for Eye_color == "Blue", but only if at least one row in the group contains "Blue"; otherwise all rows of the group are kept.
library(dplyr)

df %>%
  group_by(City, Name) %>%
  filter(if (any(Eye_color == "Blue")) Eye_color == "Blue" else TRUE) %>%
  ungroup()
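If exactly one row per City and Name is required even when a group contains several Blue rows, a slice() variant might be worth considering. This is only a sketch and not part of the answer above; it relies on which.max() returning the position of the first TRUE in a logical vector, or 1 when no row is Blue. The name result1 is just an illustrative choice.

```r
library(dplyr)

df <- data.frame(
  City      = c("A", "A", "A", "B", "B", "B", "B"),
  Name      = c("Jon", "Bill", "Bill", "Maria", "Ben", "Tina", "Tina"),
  Age       = c(23, 41, 32, 58, 26, 12, 15),
  Eye_color = c("Blue", "Blue", "Brown", "Brown", "Blue", "Blue", "Brown")
)

result1 <- df %>%
  group_by(City, Name) %>%
  # Pick the first Blue row of each group; fall back to the first row if none is Blue
  slice(which.max(Eye_color == "Blue")) %>%
  ungroup()
```

Note that the rows may come back ordered by group (City, then Name) rather than in the original order, so add an arrange() step if the order matters.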
In the second case we use if_else inside the filter statement:
df %>%
filter(if_else(Name == "Bill", Eye_color == "Blue", if_else(Name == "Tina", Eye_color == "Brown", TRUE)))
Update
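For the new dataset you can use the same code for part 1. For part 2, test City instead of Name inside if_else, and apply the condition only to names that occur more than once within a city; otherwise unique rows such as Ben (Blue, city B) would be dropped:
df %>%
  group_by(City, Name) %>%
  filter(n() == 1 |
           if_else(City == "A", Eye_color == "Blue",
                   if_else(City == "B", Eye_color == "Brown", TRUE))) %>%
  ungroup()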
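With more than two cities, nested if_else calls can get unwieldy. One option, sketched here under the assumption that each city has exactly one preferred eye color, is a named lookup vector. The names preferred and result2 are hypothetical and not taken from the question.

```r
library(dplyr)

df <- data.frame(
  City      = c("A", "A", "A", "B", "B", "B", "B"),
  Name      = c("Jon", "Bill", "Bill", "Maria", "Ben", "Tina", "Tina"),
  Age       = c(23, 41, 32, 58, 26, 12, 15),
  Eye_color = c("Blue", "Blue", "Brown", "Brown", "Blue", "Blue", "Brown")
)

# Hypothetical lookup: the eye color to keep among duplicates, per city
preferred <- c(A = "Blue", B = "Brown")

result2 <- df %>%
  group_by(City, Name) %>%
  # Keep unique names untouched; among duplicates keep the city's preferred color
  filter(n() == 1 | Eye_color == unname(preferred[as.character(City)])) %>%
  ungroup()
```

Adding a city then only means adding an entry to preferred. Duplicated names in a city missing from the lookup would be dropped, because the comparison yields NA for them.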
CodePudding user response:
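You can use this code. For the first case, rows whose name is unique within a city (such as Maria) are kept regardless of eye color:
df1 <- df %>% group_by(City, Name) %>% filter(n() == 1 | Eye_color == "Blue") %>% ungroup()
df2 <- df %>% filter(if_else(Name == "Bill", Eye_color == "Blue", if_else(Name == "Tina", Eye_color == "Brown", TRUE)))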