I want to use mutate to combine two variables that take the values 0, 1 and NA into a new variable holding their sum. However, R either counts NA as 0 or returns only NA. Is there an easy fix for this that excludes the NA values?
I am using an R textbook that does not address my specific problem.
Code I have tried:
(1)
library(tidyverse)
df <- df |>
  mutate((naked_man = naked_fj + naked_naked), na.rm = TRUE)
This returns NA for all observations.
Data:
naked_fj | naked_naked | naked_man (problem VAR) |
---|---|---|
0 | 0 | NA |
1 | 0 | NA |
NA | 1 | NA |
0 | NA | NA |
CodePudding user response:
You are just setting up the mutate() call incorrectly. You can also use tidyr::drop_na() to remove the rows that contain NA values from the data frame. Note that this discards those rows entirely rather than summing the non-missing values.
library(tidyverse)
df <- data.frame(naked_fj = c(0,1, NA, 0),
naked_naked = c(0, 0, 1, NA))
df <- df |>
  mutate(naked_man = naked_fj + naked_naked) |>
  drop_na()
RESULT:
naked_fj naked_naked naked_man
1 0 0 0
2 1 0 1
CodePudding user response:
To sum across columns while excluding the NA values, one implementation of your code in dplyr is to use rowwise():
df |>
rowwise() |>
mutate(naked_man = sum(c(naked_fj, naked_naked), na.rm = TRUE))
# naked_fj naked_naked naked_man
# <dbl> <dbl> <dbl>
# 1 0 0 0
# 2 1 0 1
# 3 NA 1 1
# 4 0 NA 0
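As a possible alternative to rowwise(), which works through the data one row at a time, the same sum can be written in a vectorised way by replacing each NA with 0 before adding. This is only a sketch: it assumes dplyr is installed and that the data frame matches the example in the question, which is rebuilt here so the block runs on its own.

```r
library(dplyr)

# Example data rebuilt from the question (assumed to match the real df)
df <- data.frame(naked_fj    = c(0, 1, NA, 0),
                 naked_naked = c(0, 0, 1, NA))

# coalesce(x, 0) swaps each NA in x for 0, so the addition never sees an NA
res <- df |>
  mutate(naked_man = coalesce(naked_fj, 0) + coalesce(naked_naked, 0))

print(res)
```

With only two columns this is easy to read; for many columns, the rowwise() or rowSums() approaches are probably less repetitive.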
If you don't need to use dplyr, base R may be easier:
df$naked_man <- rowSums(df, na.rm = TRUE)
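The one-liner above sums every column in df, so it relies on df holding only the two input columns. A slightly more defensive sketch names the columns explicitly. It also shows one way to keep NA for rows where both inputs are missing, since rowSums() with na.rm = TRUE reports 0 for such rows. The helper names cols and all_missing are made up for illustration, and the data frame is rebuilt from the question.

```r
# Example data rebuilt from the question, including the empty target column
df <- data.frame(naked_fj    = c(0, 1, NA, 0),
                 naked_naked = c(0, 0, 1, NA),
                 naked_man   = NA)

# Columns to add up, named explicitly so naked_man itself is never included
cols <- c("naked_fj", "naked_naked")

df$naked_man <- rowSums(df[, cols], na.rm = TRUE)

# Optional: restore NA where every input in the row is missing
all_missing <- rowSums(!is.na(df[, cols])) == 0
df$naked_man[all_missing] <- NA

print(df)
```

Whether an all-missing row should become 0 or NA depends on what the variable is meant to measure, so the last step can be dropped if 0 is the desired result.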
Data:
df <- read.table(text = "naked_fj naked_naked naked_man
0 0 NA
1 0 NA
NA 1 NA
0 NA NA", header = TRUE)
df <- df[,-3]
df[] <- lapply(df[], as.numeric)