I have tried the dplyr solution from "R Set Column Value based on other Column Values", but when I run it the columns are all repeated as THING$THING1 etc. I need just one column that is 1 when any of the columns THING1:THING4 contains a 1, and 0 if none of them do. (The data uses 1 for yes and 0 for no as the answers to a series of related questions.)
Thing1 | Thing2 | Thing3 | Thing4 |
---|---|---|---|
0 | 0 | 0 | 0 |
1 | 0 | 0 | 0 |
0 | 1 | 0 | 1 |
0 | 0 | 0 | 0 |
0 | 0 | 1 | 0 |
1 | 0 | 0 | 0 |
0 | 0 | 0 | 0 |
And I want to get the column:
Thing |
---|
0 |
1 |
1 |
0 |
1 |
1 |
0 |
The code as I'm using it is:
Things <- dataset %>%
  select(c(THING1:THING4)) %>%
  mutate(THING = across(.cols = THING1:THING4,
                        .fns = ~ if_else(.x == 1 | is.na(.x),
                                         1,
                                         0)))
I am using a column range (THING1:THING4) rather than naming each column, as the real data has about a dozen columns to check.
CodePudding user response:
Here is one potential solution:
library(dplyr)
df <- read.table(text = "Thing1 Thing2 Thing3 Thing4
0 0 0 0
1 0 0 0
0 1 0 1
0 0 0 0
0 0 1 0
1 0 0 0
0 0 0 0", header = TRUE)
df %>%
  mutate(flag = as.numeric(if_any(starts_with("Thing"), ~ .x == 1)))
#>   Thing1 Thing2 Thing3 Thing4 flag
#> 1      0      0      0      0    0
#> 2      1      0      0      0    1
#> 3      0      1      0      1    1
#> 4      0      0      0      0    0
#> 5      0      0      1      0    1
#> 6      1      0      0      0    1
#> 7      0      0      0      0    0
Created on 2022-07-26 by the reprex package (v2.0.1)
Edit
From the code in your question I see that you can have NAs too. If you want to 'ignore' NAs you could use the following (the output below is for a version of df with NA in Thing4 in rows 4 and 5):
df %>%
  mutate(flag = as.numeric(if_any(starts_with("Thing"), ~ .x == 1 & !is.na(.x))))
#>   Thing1 Thing2 Thing3 Thing4 flag
#> 1      0      0      0      0    0
#> 2      1      0      0      0    1
#> 3      0      1      0      1    1
#> 4      0      0      0     NA    0
#> 5      0      0      1     NA    1
#> 6      1      0      0      0    1
#> 7      0      0      0      0    0
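Why the extra !is.na(.x) matters can be sketched in base R. This is only an illustration of R's three-valued logic, under the assumption that the per-column results for a row are combined with |; it is not a description of dplyr's internals, and the variable names are made up for the example.

```r
# One row of answers where the last answer is missing (hypothetical values).
row_values <- c(0, 0, 0, NA)

# Comparing NA with 1 yields NA, not FALSE.
plain <- row_values == 1

# OR-ing the results together: FALSE | NA is NA, so the flag would be NA.
flag_plain <- Reduce(`|`, plain)

# Guarding with !is.na() turns the NA comparison into FALSE (NA & FALSE is FALSE).
guarded <- row_values == 1 & !is.na(row_values)
flag_guarded <- Reduce(`|`, guarded)

# A row that has both a 1 and an NA is flagged either way, since TRUE | NA is TRUE.
mixed <- c(0, 0, 1, NA)
flag_mixed <- Reduce(`|`, mixed == 1)
```

Here flag_plain is NA, flag_guarded is FALSE and flag_mixed is TRUE, which is consistent with rows 4 and 5 of the output above.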
CodePudding user response:
Your code applies the same operation to each of several columns, instead of summarizing several columns into one. If you want to use across() (rather than if_any()) you could make use of rowSums().
df |>
  mutate(Thing = as.numeric(rowSums(across(Thing1:Thing4), na.rm = TRUE) >= 1))
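To make the difference concrete, here is a minimal sketch on a small made-up data frame (two columns only, chosen for illustration). It shows that assigning the result of across() to a single name stores a whole data frame in that column, which is what prints as Thing$Thing1 and so on, while wrapping it in rowSums() collapses it to one value per row.

```r
library(dplyr)

# Small illustrative data frame (not the asker's real data).
small <- data.frame(Thing1 = c(0, 1, 0), Thing2 = c(0, 0, 1))

# across() returns one result per input column, so Thing becomes a
# data-frame column containing Thing1 and Thing2.
packed <- small |>
  mutate(Thing = across(Thing1:Thing2, ~ if_else(.x == 1, 1, 0)))
packed_is_df <- is.data.frame(packed$Thing)
packed_names <- names(packed$Thing)

# rowSums() reduces those columns to a single number per row.
flagged <- small |>
  mutate(Thing = as.numeric(rowSums(across(Thing1:Thing2), na.rm = TRUE) >= 1))
```

With this input, packed$Thing is itself a data frame with columns Thing1 and Thing2, whereas flagged$Thing is the plain numeric vector 0, 1, 1.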
If you want to adapt your own implementation, you could use rowwise() (and any()):
df |>
  #select(c(Thing1:Thing4)) |>
  rowwise() |>
  mutate(Thing = if_else(any(c_across(Thing1:Thing4) == 1) | any(is.na(c_across(Thing1:Thing4))), 1, 0)) |>
  ungroup()
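Note that the two variants in this answer treat missing values differently: the rowSums() version with na.rm = TRUE counts an NA as 0, while the rowwise() version follows the question's code and flags a row as 1 when it contains an NA. A rough base R sketch of the two rules, on a made-up data frame and without dplyr, might look like this:

```r
# Illustrative data with one missing answer (hypothetical values).
answers <- data.frame(Thing1 = c(0, 1, 0), Thing2 = c(0, 0, NA))

# Rule 1: NA is treated as 0, so a row with only 0 and NA is not flagged.
na_as_zero <- as.numeric(rowSums(answers, na.rm = TRUE) >= 1)

# Rule 2: NA is treated like a 1, so a row containing an NA is flagged.
na_as_one <- as.numeric(apply(answers, 1, function(r) any(r == 1 | is.na(r))))
```

For this input na_as_zero is 0, 1, 0 and na_as_one is 0, 1, 1, so which version fits depends on what a missing answer should mean in the real data.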
Data from @jared_mamrot