I have the following test data frame:
df1 <- data.frame(site = c('1' , '1' , '1' , '1' , '2' , '2' ,
'2' , '2' , '3' , '3' , '3' , '3') ,
species = c('A' , 'B' , 'C' , 'D' , 'A' , 'B' ,
'C' , 'D' , 'A' , 'B' , 'C' , 'D') ,
value = c('1' , '0' , '0' , '4' , '0' , '0' ,
'3' , '4' , '0' , '0' , '0' , '1'))
I need to remove a species only if its value is 0 at every site. A species with a value >= 1 at one or more sites should be kept, including its rows where the value is 0.
A tidyverse method is preferred.
CodePudding user response:
You can try this (with a suggestion from benson23):
library(dplyr)
df1 %>%
group_by(species) %>%
filter(!all(value == "0"))
# A tibble: 9 × 3
# Groups: species [3]
site species value
<chr> <chr> <chr>
1 1 A 1
2 1 C 0
3 1 D 4
4 2 A 0
5 2 C 3
6 2 D 4
7 3 A 0
8 3 C 0
9 3 D 1
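The grouped condition !all(value == "0") is the same test as any(value != "0"), which is the form some of the other answers use. A minimal sketch on two hypothetical vectors (x and z are made-up stand-ins for one species' values, not taken from df1):

```r
# x and z are hypothetical per-species value vectors, for illustration only
x <- c("0", "3", "0")  # one non-zero entry
z <- c("0", "0", "0")  # all zeros

keep_x_all <- !all(x == "0")  # TRUE: not every entry is "0"
keep_x_any <- any(x != "0")   # TRUE: same answer, phrased with any()
keep_z_all <- !all(z == "0")  # FALSE: every entry is "0", so the group is dropped
keep_z_any <- any(z != "0")   # FALSE
```

Either form works inside filter() after group_by(species), because the single TRUE or FALSE is recycled across all rows of the group.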
CodePudding user response:
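Your value column holds text rather than numbers (a factor in R versions before 4.0, a character vector from 4.0 on), so convert it to numeric before comparing it with a threshold. The as.character() step keeps the conversion correct when the column is a factor: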
library(dplyr)
df1 %>%
group_by(species) %>%
filter(any(as.numeric(as.character(value)) >= 1))
# # A tibble: 9 x 3
# # Groups: species [3]
# site species value
# <fct> <fct> <fct>
# 1 1 A 1
# 2 1 C 0
# 3 1 D 4
# 4 2 A 0
# 5 2 C 3
# 6 2 D 4
# 7 3 A 0
# 8 3 C 0
# 9 3 D 1
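To see why as.character() is needed for a factor, here is a minimal sketch. The vector f is a hypothetical stand-in for a value column stored as a factor, not part of the original data:

```r
# f is a hypothetical factor, standing in for `value` when it is stored as a factor
f <- factor(c("1", "0", "0", "4"))

# Without as.character(): the internal level codes (levels are "0", "1", "4")
codes <- as.numeric(f)                # 2 1 1 3

# With as.character(): the numbers the labels actually spell out
nums <- as.numeric(as.character(f))   # 1 0 0 4
```

On a character column the extra as.character() call is harmless, so the same filter works in either case.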
CodePudding user response:
dplyr option, using any with filter:
df1 <- data.frame(site = c('1' , '1' , '1' , '1' , '2' , '2' ,
'2' , '2' , '3' , '3' , '3' , '3') ,
species = c('A' , 'B' , 'C' , 'D' , 'A' , 'B' ,
'C' , 'D' , 'A' , 'B' , 'C' , 'D') ,
value = c('1' , '0' , '0' , '4' , '0' , '0' ,
'3' , '4' , '0' , '0' , '0' , '1'))
library(dplyr)
df1 %>%
group_by(species) %>%
filter(any(value != 0))
#> # A tibble: 9 × 3
#> # Groups: species [3]
#> site species value
#> <chr> <chr> <chr>
#> 1 1 A 1
#> 2 1 C 0
#> 3 1 D 4
#> 4 2 A 0
#> 5 2 C 3
#> 6 2 D 4
#> 7 3 A 0
#> 8 3 C 0
#> 9 3 D 1
Created on 2022-07-29 by the reprex package (v2.0.1)
base R option:
df1 <- data.frame(site = c('1' , '1' , '1' , '1' , '2' , '2' ,
'2' , '2' , '3' , '3' , '3' , '3') ,
species = c('A' , 'B' , 'C' , 'D' , 'A' , 'B' ,
'C' , 'D' , 'A' , 'B' , 'C' , 'D') ,
value = c('1' , '0' , '0' , '4' , '0' , '0' ,
'3' , '4' , '0' , '0' , '0' , '1'))
subset(df1, ave(value != 0, species, FUN = any))
#> site species value
#> 1 1 A 1
#> 3 1 C 0
#> 4 1 D 4
#> 5 2 A 0
#> 7 2 C 3
#> 8 2 D 4
#> 9 3 A 0
#> 11 3 C 0
#> 12 3 D 1
CodePudding user response:
Using base R with %in%: first collect the 'species' that have at least one 'value' not equal to 0, then keep every row of the full dataset whose 'species' is in that set
subset(df1, species %in% species[value != 0])
site species value
1 1 A 1
3 1 C 0
4 1 D 4
5 2 A 0
7 2 C 3
8 2 D 4
9 3 A 0
11 3 C 0
12 3 D 1
Or the same approach with dplyr filter
library(dplyr)
df1 %>%
filter(species %in% species[value != 0])
site species value
1 1 A 1
2 1 C 0
3 1 D 4
4 2 A 0
5 2 C 3
6 2 D 4
7 3 A 0
8 3 C 0
9 3 D 1
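The one-liner can be unpacked into its two steps. This is only a sketch in base R: the names nonzero_species, keep and res are made up for illustration, and stringsAsFactors = FALSE is set explicitly (an assumption, so the columns are character on any R version):

```r
# Same data as the question, built with rep() for brevity;
# stringsAsFactors = FALSE is an assumption to keep the columns as character
df1 <- data.frame(site    = rep(c("1", "2", "3"), each = 4),
                  species = rep(c("A", "B", "C", "D"), times = 3),
                  value   = c("1", "0", "0", "4", "0", "0",
                              "3", "4", "0", "0", "0", "1"),
                  stringsAsFactors = FALSE)

# Step 1: species that have at least one non-zero value
nonzero_species <- unique(df1$species[df1$value != 0])

# Step 2: logical flag for every row whose species is in that set
keep <- df1$species %in% nonzero_species

# Rows of the kept species, including their zero rows
res <- df1[keep, ]
```

No grouping is needed here, because %in% checks each row's species against the whole set at once.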
CodePudding user response:
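Keep the rows of each species whose values do not sum to 0 (this assumes the values are never negative), e.g.
library(dplyr)
df1 %>%
group_by(species) %>%
# as.character() first, so a factor column is converted by its labels, not its level codes
filter(sum(as.numeric(as.character(value))) != 0)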
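For comparison, the per-species sum can also be computed in base R with ave(). This is a rough sketch rather than part of the answer above: num, group_sum and res_sum are illustrative names, and it relies on the values being non-negative, since positive and negative values could otherwise cancel to 0:

```r
# Same data as the question, built with rep() for brevity
df1 <- data.frame(site    = rep(c("1", "2", "3"), each = 4),
                  species = rep(c("A", "B", "C", "D"), times = 3),
                  value   = c("1", "0", "0", "4", "0", "0",
                              "3", "4", "0", "0", "0", "1"))

# Convert by label, which is safe for both character and factor columns
num <- as.numeric(as.character(df1$value))

# Sum of the values within each species, repeated on every row of that species
group_sum <- ave(num, df1$species, FUN = sum)

# Keep the rows of species whose total is not 0
res_sum <- df1[group_sum != 0, ]
```

Here the totals are 1 for A, 0 for B, 3 for C and 9 for D, so only species B is removed.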