Delete all but first instance in data frame when the rows aren't duplicates in R [duplicate]

I have a data frame that looks something like the first df below. There are duplicates in col1 but not in col2. I want to keep only the first row for each value of col1 and remove the rest, so that the result looks like the second df below.

col1	col2
x	1
x	2
x	3
y	1
y	2
y	3

col1	col2
x	1
y	1

I tried this but it didn't work:

df %>% group_by(col1) %>% filter(duplicated(col1) | n()!=1)

CodePudding user response：

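We just need distinct() on col1. With .keep_all = TRUE it keeps the first row for each value of col1 and retains the other columns: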

library(dplyr)
distinct(df, col1, .keep_all = TRUE)
  col1 col2
1    x    1
2    y    1

Or, if we want to use duplicated(), negate it (!) so that only the first row for each value of col1 is returned:

df %>%
    filter(!duplicated(col1))
  col1 col2
1    x    1
2    y    1
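
As a side note on why the attempt in the question returns every row: inside group_by(col1), n() is 3 for each group, so n() != 1 is TRUE for every row and the | condition never drops anything. The sketch below illustrates this and shows a grouped alternative. It assumes dplyr 1.0.0 or later (for slice_head()) and rebuilds the example data so it runs on its own; the names attempt and first_rows are only illustrative.

```r
library(dplyr)

# Same example data as in the question, rebuilt so the snippet is self-contained
df <- data.frame(col1 = c("x", "x", "x", "y", "y", "y"),
                 col2 = c(1L, 2L, 3L, 1L, 2L, 3L))

# The filter from the question: within each group n() is 3,
# so n() != 1 is TRUE for all rows and nothing is removed
attempt <- df %>%
    group_by(col1) %>%
    filter(duplicated(col1) | n() != 1)
nrow(attempt)  # 6, i.e. all rows are kept

# Grouped alternative: keep the first row of each col1 group
first_rows <- df %>%
    group_by(col1) %>%
    slice_head(n = 1) %>%
    ungroup()
first_rows
```

For this particular task the ungrouped filter(!duplicated(col1)) is simpler; the grouped form is mainly useful when more than one row per group is wanted (for example n = 2).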

data

df <- structure(list(col1 = c("x", "x", "x", "y", "y", "y"), col2 = c(1L, 
2L, 3L, 1L, 2L, 3L)), class = "data.frame", row.names = c(NA, 
-6L))

CodePudding user response：

Why not just use base R:

df[!duplicated(df$col1), ]
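
A minimal, self-contained sketch of the base R approach, using the same example data as the question. The intermediate names dup and result are only illustrative:

```r
# Same example data as in the question
df <- data.frame(col1 = c("x", "x", "x", "y", "y", "y"),
                 col2 = c(1L, 2L, 3L, 1L, 2L, 3L))

# duplicated() marks the second and later occurrences of each col1 value
dup <- duplicated(df$col1)
dup  # FALSE TRUE TRUE FALSE TRUE TRUE

# Keep only the rows that are not marked as duplicates
result <- df[!dup, ]
result

# Base R subsetting keeps the original row names (1 and 4 here);
# reset them if consecutive numbering is wanted
rownames(result) <- NULL
```

Unlike the dplyr versions, plain subsetting preserves the original row names, which is why the last line resets them.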