I have the following dataset, and I need to remove rows that are entirely empty or that have the same value in every column:
df <- data.frame(players = c('', 'Uncredited', 'C', 'D', 'E'),
                 assists = c("", "Uncredited", 4, 4, 3),
                 ratings = c("", "Uncredited", 4, 7, ""))
df
players assists ratings
<chr> <chr> <chr>
Uncredited Uncredited Uncredited
C 4 4
D 4 7
E 3
In our example, the 1st row is all empty and the 2nd row has the same value, Uncredited, in every column. Hence, the first two rows should be removed.
Desired Output
players assists ratings
<chr> <dbl> <chr>
C 4 4
D 4 7
E 3
Any suggestions would be appreciated. Thanks!
CodePudding user response:
You can use apply to loop over all rows and keep those that have more than one distinct value. Note that a row whose values are all empty also has only one distinct value, so the first condition is already covered by the second.
df[apply(df,
         MARGIN = 1, # rowwise
         FUN = function(x) length(unique(x)) > 1), ]
#> players assists ratings
#> 3 C 4 4
#> 4 D 4 7
#> 5 E 3
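To see why this works, it may help to look at the intermediate step. The following is a minimal sketch that splits the one-liner in two; the names n_distinct_per_row and result are only illustrative and not part of the original answer. It assumes every column is character, as in the question's data.

```r
# Same data as in the question; every column ends up as character.
df <- data.frame(players = c("", "Uncredited", "C", "D", "E"),
                 assists = c("", "Uncredited", 4, 4, 3),
                 ratings = c("", "Uncredited", 4, 7, ""))

# Count the distinct values in each row.
# apply() hands each row to FUN as a character vector.
n_distinct_per_row <- apply(df, MARGIN = 1,
                            FUN = function(x) length(unique(x)))
print(n_distinct_per_row)

# Keep only the rows that have more than one distinct value.
result <- df[n_distinct_per_row > 1, ]
print(result)
```

The all-empty row and the all-Uncredited row both have a count of 1, so a single test removes both.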
CodePudding user response:
We could use if_any to keep the rows where at least one of the other columns differs from players
library(dplyr)
df %>%
filter(if_any(assists:ratings, ~ .x != players))
-output
players assists ratings
1 C 4 4
2 D 4 7
3 E 3
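The same comparison against the first column can probably also be written in base R, without dplyr. This is only a sketch: it assumes the data contain no NA values (a comparison with NA yields NA), and the names differs and result are illustrative.

```r
# Same data as in the question.
df <- data.frame(players = c("", "Uncredited", "C", "D", "E"),
                 assists = c("", "Uncredited", 4, 4, 3),
                 ratings = c("", "Uncredited", 4, 7, ""))

# Compare every column except the first against the first column.
# The result is a logical matrix with one row per data row.
differs <- df[-1] != df[[1]]

# Keep the rows where at least one column differs from the first.
result <- df[rowSums(differs) > 0, ]
print(result)
```

Unlike the dplyr version, this keeps the original row names (3, 4, 5) rather than renumbering them.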