I have a dataframe that looks like this:
col1 | col2 | col3 |
---|---|---|
tn1 | a | b |
tn1 | a | c |
tn2 | d | b |
tn3 | a | b |
I want to keep only the rows whose col1 and col2 combination is duplicated, retaining BOTH rows (not just the extra copy):
col1 | col2 | col3 |
---|---|---|
tn1 | a | b |
tn1 | a | c |
I've tried unique(), distinct(), and anti_join(), but can't figure it out.
CodePudding user response:
Base R:
df[duplicated(df[c("col1", "col2")]) | duplicated(df[c("col1", "col2")], fromLast = TRUE), ]
col1 col2 col3
1 tn1 a b
2 tn1 a c
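To see why a base R approach needs two duplicated() calls, here is a small self-contained sketch using the sample data from the question. The names `key`, `is_dup`, and `result` are only illustrative. duplicated() on its own marks the second and later occurrences, so the first copy would be dropped unless the check is also run from the end.

```r
# Sample data copied from the question
df <- data.frame(
  col1 = c("tn1", "tn1", "tn2", "tn3"),
  col2 = c("a", "a", "d", "a"),
  col3 = c("b", "c", "b", "b"),
  stringsAsFactors = FALSE
)

# Only col1 and col2 define what counts as a duplicate
key <- df[c("col1", "col2")]

# Forward pass flags later copies: FALSE TRUE FALSE FALSE
from_first <- duplicated(key)
# Backward pass flags earlier copies: TRUE FALSE FALSE FALSE
from_last <- duplicated(key, fromLast = TRUE)

# A row is kept if either pass flags it
is_dup <- from_first | from_last
result <- df[is_dup, ]
print(result)
```

Checking only col1 happens to give the same rows for this data, but it would also keep rows that share col1 while differing in col2.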
CodePudding user response:
I found this, and it worked:
df %>% group_by(col1) %>% filter(duplicated(col2) | duplicated(col2, fromLast = TRUE))
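As a possible alternative in dplyr, one could group by both key columns and keep groups with more than one row. This is a minimal sketch that assumes dplyr is installed and reuses the sample data from the question; `result` is just an illustrative name.

```r
library(dplyr)

# Sample data copied from the question
df <- data.frame(
  col1 = c("tn1", "tn1", "tn2", "tn3"),
  col2 = c("a", "a", "d", "a"),
  col3 = c("b", "c", "b", "b"),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(col1, col2) %>%  # one group per col1/col2 combination
  filter(n() > 1) %>%       # keep every row of groups that occur more than once
  ungroup()                 # drop the grouping so later steps see a plain tibble

print(result)
```

Grouping on both columns states the intent directly and extends to more key columns by adding them to group_by().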