df <- data.frame(
  id = rep(letters[1:3], 9),
  m1 = ceiling(rnorm(9, 10, 3)),
  m2 = ceiling(rnorm(9, 10, 6)),
  m3 = 0
)
head(df)
id m1 m2 m3
1 a 12 14 0
2 b 11 9 0
3 c 10 10 0
4 a 16 1 0
5 b 5 15 0
6 c 8 7 0
I have a data frame with metadata in the left-most columns and a raw data matrix attached to the right side. I'd like to remove the columns on the right side of the data frame that sum to zero, without splitting it into two separate objects, using dplyr::select_if. Selecting only the metadata columns works:
df %>%
  select_if(!(grepl("m",names(.)))) %>%
  head()
id
1 a
2 b
3 c
4 a
5 b
6 c
When I attempt to add a second term to evaluate whether the raw data columns (indicated by "m" prefix) sum to zero, I get the following error message:
> df %>%
select_if(!(grepl("m",names(.))) || sum(.) > 0)
Error in `select_if()`:
! `.p` is invalid.
✖ `.p` should have the same size as the number of variables in the tibble.
ℹ `.p` is size 1.
ℹ The tibble has 4 columns, including the grouping variables.
Run `rlang::last_error()` to see where the error occurred.
Warning message:
In !(grepl("m", names(.))) || sum(.) > 0 :
'length(x) = 4 > 1' in coercion to 'logical(1)'
> rlang::last_error()
<error/rlang_error>
Error in `select_if()`:
! `.p` is invalid.
✖ `.p` should have the same size as the number of variables in the tibble.
ℹ `.p` is size 1.
ℹ The tibble has 4 columns, including the grouping variables.
I greatly appreciate any assistance with this!
CodePudding user response:
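As @akrun already pointed out in the comments, select_if() is no longer recommended (dplyr has superseded the scoped verbs). Instead, we can use select() to keep all variables that don't start with "M", via !starts_with("M"), together with all variables that are numeric and whose sum is greater than zero, via where(~ is.numeric(.x) && sum(.x) > 0). Note that starts_with() ignores case by default, so "M" matches the m-prefixed columns. select() takes the union of the two selections, so id is kept by the first and the non-zero m columns by the second.
Here the double & operator (&&) is important. We first check whether a column is numeric, and only if it is does control flow move on to check whether the sum is greater than zero. Without this short-circuiting we would receive an error, because a non-numeric variable would be passed to sum().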
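The short-circuit behaviour can be checked in isolation. This is a minimal sketch, where x is a hypothetical character vector standing in for a non-numeric column such as id:

```r
# Hypothetical stand-in for a non-numeric column such as id
x <- c("a", "b", "c")

# && stops at the first FALSE, so sum(x) is never evaluated
short_circuit <- is.numeric(x) && sum(x) > 0

# & evaluates both sides, so sum() is called on a character vector and fails;
# tryCatch() turns that failure into the string "error" for inspection
elementwise <- tryCatch(
  is.numeric(x) & sum(x) > 0,
  error = function(e) "error"
)
```

Here short_circuit is FALSE, while elementwise holds "error" because sum() does not accept character input.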
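library(dplyr)

df %>%
  # keep every column that is not an m-column; starts_with() ignores case by default
  select(!starts_with("M"),
         # plus numeric columns whose sum is greater than zero
         where(~ is.numeric(.x) && sum(.x) > 0))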
#> id m1 m2
#> 1 a 12 18
#> 2 b 13 24
#> 3 c 6 12
#> 4 a 11 8
#> 5 b 9 0
#> 6 c 12 2
#> 7 a 11 9
#> 8 b 12 4
#> 9 c 4 8
#> 10 a 12 18
#> 11 b 13 24
#> 12 c 6 12
#> 13 a 11 8
#> 14 b 9 0
#> 15 c 12 2
#> 16 a 11 9
#> 17 b 12 4
#> 18 c 4 8
#> 19 a 12 18
#> 20 b 13 24
#> 21 c 6 12
#> 22 a 11 8
#> 23 b 9 0
#> 24 c 12 2
#> 25 a 11 9
#> 26 b 12 4
#> 27 c 4 8
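
As a side note on the original error: || only works on single TRUE/FALSE values and sum(.) is applied to the whole data frame, so the condition cannot produce one value per column, which is why `.p` is reported as size 1. A rough sketch of a per-column version in base R follows. It assumes a seed (not in the original post) for reproducibility, and keep and res are hypothetical names introduced for illustration:

```r
set.seed(1)  # assumption: seed added only to make the sketch reproducible
df <- data.frame(
  id = rep(letters[1:3], 9),
  m1 = ceiling(rnorm(9, 10, 3)),
  m2 = ceiling(rnorm(9, 10, 6)),
  m3 = 0
)

# One TRUE/FALSE per column: keep columns without the "m" prefix,
# or numeric columns whose sum is greater than zero.
# The vectorised | combines the two length-4 logical vectors element by element.
keep <- !grepl("^m", names(df)) |
  vapply(df, function(col) is.numeric(col) && sum(col) > 0, logical(1))

# Subset the columns without splitting the data frame into separate objects
res <- df[, keep, drop = FALSE]
names(res)
```

With this seed the all-zero column m3 is dropped, and id, m1 and m2 remain.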