I have the following data frame Data
:
a | b | c |
---|---|---|
1 | 1 | 2 |
2 | NA | 4 |
NA | 3 | NA |
NA | NA | NA |
I want to calculate the sum of columns a to c using dplyr
s mutate
function and sum
to get the following:
a | b | c | d |
---|---|---|---|
1 | 1 | 2 | 5 |
2 | NA | 4 | 6 |
NA | 3 | NA | 3 |
NA | NA | NA | NA |
The important thing is for NAs to be treated like 0 basically except when they are all NA then it will return the sum as NA.
I think I can do this:
Data<-Data %>%
mutate(d=sum(a,b,c,na.rm=TRUE))
The issue is I dont want to list all the variables a
b
and c
, but want to make use of the :
functionality so that I can list the variables like this a:c
.
I was thinking of writing the code like this:
Data<-Data %>%
mutate(d=sum(a:c,na.rm=TRUE))
but this doesn't work. I don't know if I should put a quotation around a and around c?
I swear I've done this million of times before, but I can't find previous code I've used this in and for some reason my brain froze today.
CodePudding user response:
Just keep it simple:
df %>%
mutate(
d = rowSums(across(everything()), na.rm =T),
d = na_if(d, 0)
)
# A tibble: 4 x 4
a b c d
<int> <int> <int> <dbl>
1 1 1 2 4
2 2 NA 4 6
3 NA 3 NA 3
4 NA NA NA NA
CodePudding user response:
You can change everything for any tidyverse selector like matches or starts_with
library(tidyverse)
df_example <- tibble::tribble(
~a, ~b, ~c,
1L, 1L, 2L,
2L, NA, 4L,
NA, 3L, NA,
NA, NA, NA
)
df_example |>
rowwise() |>
mutate(d = c_across(everything()) |> sum(na.rm = TRUE)) |>
ungroup()
#> # A tibble: 4 x 4
#> a b c d
#> <int> <int> <int> <int>
#> 1 1 1 2 4
#> 2 2 NA 4 6
#> 3 NA 3 NA 3
#> 4 NA NA NA 0
Created on 2021-12-10 by the reprex package (v2.0.1)
A less general solution is to use a rowwise function like rowSums
library(tidyverse)
df_example <- tibble::tribble(
~a, ~b, ~c,
1L, 1L, 2L,
2L, NA, 4L,
NA, 3L, NA,
NA, NA, NA
)
df_example |>
mutate(d = across(everything()) |> rowSums(na.rm = TRUE))
#> # A tibble: 4 x 4
#> a b c d
#> <int> <int> <int> <dbl>
#> 1 1 1 2 4
#> 2 2 NA 4 6
#> 3 NA 3 NA 3
#> 4 NA NA NA 0
Created on 2021-12-10 by the reprex package (v2.0.1)
CodePudding user response:
Like so:
df %>%
rowwise() %>%
mutate(d=sum(across(a:c),na.rm=TRUE)) %>%
ungroup()
Also rowSums
could be an option