I have data like this:
df<- structure(list(Core = c("Bestman", "Tetra"), member1 = c("Tera1",
"Brownie1"), member2 = c("Tera2", "Brownie2"), member3 = c("Tera3",
"Brownie3"), member4 = c("Tera4", "Brownie4"), member5 = c("Tera5",
"Brownie5"), member6 = c("Brownie2", "Tera2"), member7 = c("Tera1",
"Tera1"), member8 = c("Tera2", "")), class = "data.frame", row.names = c(NA,
-2L))
It looks like this:
Core    member1  member2  member3  member4  member5  member6  member7 member8
Bestman Tera1    Tera2    Tera3    Tera4    Tera5    Brownie2 Tera1   Tera2
Tetra   Brownie1 Brownie2 Brownie3 Brownie4 Brownie5 Tera2    Tera1
In the first row, Tera1 and Tera2 are repeated, so the repeats must be deleted.
In the second row, Brownie2, Tera1 and Tera2 have already appeared in the first row, so they must be deleted as well.
My desired output looks like this:
Core    member1  member2 member3  member4  member5  member6
Bestman Tera1    Tera2   Tera3    Tera4    Tera5    Brownie2
Tetra   Brownie1         Brownie3 Brownie4 Brownie5
CodePudding user response:
One way is to pivot to long format, keep only the first occurrence of each value, and pivot back to wide:
library(dplyr)
library(tidyr)
df %>%
  pivot_longer(-Core) %>%
  distinct(value, .keep_all = TRUE) %>%
  pivot_wider(names_from = name, values_from = value)
  Core    member1  member2 member3  member4  member5  member6  member8
  <chr>   <chr>    <chr>   <chr>    <chr>    <chr>    <chr>    <chr>
1 Bestman Tera1    Tera2   Tera3    Tera4    Tera5    Brownie2 NA
2 Tetra   Brownie1 NA      Brownie3 Brownie4 Brownie5 NA
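The member8 column survives in that result because the blank string in the second row counts as a distinct value. A possible variation, assuming blanks should simply be treated as missing, is to filter them out before calling distinct(). This is only a sketch; the filter() step and the name res are additions that are not in the code above.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
df <- data.frame(
  Core    = c("Bestman", "Tetra"),
  member1 = c("Tera1", "Brownie1"),
  member2 = c("Tera2", "Brownie2"),
  member3 = c("Tera3", "Brownie3"),
  member4 = c("Tera4", "Brownie4"),
  member5 = c("Tera5", "Brownie5"),
  member6 = c("Brownie2", "Tera2"),
  member7 = c("Tera1", "Tera1"),
  member8 = c("Tera2", ""),
  stringsAsFactors = FALSE
)

res <- df %>%
  pivot_longer(-Core) %>%
  filter(value != "") %>%                 # assumption: blank cells count as missing
  distinct(value, .keep_all = TRUE) %>%   # keep the first occurrence of each value
  pivot_wider(names_from = name, values_from = value)

res
```

With the blank removed, no value is left under member7 or member8, so pivot_wider() does not create those columns.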
CodePudding user response:
If every duplicate should become NA, one option is to apply duplicated to the vector of values from all columns except the first, read row by row, and then assign NA to those positions in the original data:
df[-1][matrix(duplicated(c(t(df[-1]))), nrow = nrow(df),
byrow = TRUE)] <- NA_character_
Output:
> df
     Core  member1 member2  member3  member4  member5  member6 member7 member8
1 Bestman    Tera1   Tera2    Tera3    Tera4    Tera5 Brownie2    <NA>    <NA>
2   Tetra Brownie1    <NA> Brownie3 Brownie4 Brownie5     <NA>    <NA>
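To see why the one-liner works, it may help to unpack it into its intermediate objects. The sketch below does the same thing in separate steps; the names members, vals, dup and mask are made up for illustration and do not appear in the code above.

```r
# Same data as in the question
df <- data.frame(
  Core    = c("Bestman", "Tetra"),
  member1 = c("Tera1", "Brownie1"),
  member2 = c("Tera2", "Brownie2"),
  member3 = c("Tera3", "Brownie3"),
  member4 = c("Tera4", "Brownie4"),
  member5 = c("Tera5", "Brownie5"),
  member6 = c("Brownie2", "Tera2"),
  member7 = c("Tera1", "Tera1"),
  member8 = c("Tera2", ""),
  stringsAsFactors = FALSE
)

members <- df[-1]            # everything except the Core column

# t() gives a matrix with one column per original row; c() reads it
# column by column, so vals lists row 1's values, then row 2's values.
vals <- c(t(members))

# TRUE for every value that already appeared earlier in vals
dup <- duplicated(vals)

# Fold the flags back into the 2 x 8 shape, filling row by row
mask <- matrix(dup, nrow = nrow(df), byrow = TRUE)

# Blank out the flagged cells and write the result back
members[mask] <- NA_character_
df[-1] <- members
df
```

Reading the values row by row is what makes the first row "win": a value is kept where it first appears and flagged everywhere after that, including later rows.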
Then keep only the columns that have at least one non-NA, non-blank value:
df1 <- df[colSums(is.na(df)|df == "") < nrow(df)]
df1
     Core  member1 member2  member3  member4  member5  member6
1 Bestman    Tera1   Tera2    Tera3    Tera4    Tera5 Brownie2
2   Tetra Brownie1    <NA> Brownie3 Brownie4 Brownie5     <NA>
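The desired output in the question shows empty cells rather than NA. If that presentation matters, one extra step could replace the remaining NA values with blank strings. This is an optional addition, assuming blanks are preferred over NA, and the block repeats the earlier steps only so that it runs on its own.

```r
# Same data as in the question
df <- data.frame(
  Core    = c("Bestman", "Tetra"),
  member1 = c("Tera1", "Brownie1"),
  member2 = c("Tera2", "Brownie2"),
  member3 = c("Tera3", "Brownie3"),
  member4 = c("Tera4", "Brownie4"),
  member5 = c("Tera5", "Brownie5"),
  member6 = c("Brownie2", "Tera2"),
  member7 = c("Tera1", "Tera1"),
  member8 = c("Tera2", ""),
  stringsAsFactors = FALSE
)

# Steps from the answer: flag duplicates row by row, then drop empty columns
df[-1][matrix(duplicated(c(t(df[-1]))), nrow = nrow(df),
              byrow = TRUE)] <- NA_character_
df1 <- df[colSums(is.na(df) | df == "") < nrow(df)]

# Optional extra step: show the remaining NA cells as blanks
df1[is.na(df1)] <- ""
df1
```

Keeping NA is usually safer for further processing, so this conversion is probably best left until the table is only being displayed or exported.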