I have a dataframe:
dat <- data.frame(col1 = sample(0:3, 10, replace = TRUE),
                  col2 = sample(0:3, 10, replace = TRUE),
                  col3 = sample(0:3, 10, replace = TRUE),
                  col4 = sample(0:3, 10, replace = TRUE))
I want to create a new vector (outside of the dataframe), var, that is 1 if the sum of col3 and col4 is >= 4 and 0 otherwise. How can I do this? I tried using sum within an ifelse statement, but it seems to produce character output.
Any leads? Thanks!
CodePudding user response:
More generally, you can go the apply route, which lets you put any further logic you need into the function applied to each row:
apply(dat, 1, FUN = function(x) as.integer(sum(x[3:4], na.rm = TRUE) >= 4))
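As a minimal sketch of that idea, the row logic can be pulled out into a named function. The helper name row_flag, its cols and threshold arguments, and the seed value are all assumptions made for illustration; they are not part of the original answer.

```r
# Reproducible toy data; the seed value is an arbitrary choice for illustration.
set.seed(42)
dat <- data.frame(col1 = sample(0:3, 10, replace = TRUE),
                  col2 = sample(0:3, 10, replace = TRUE),
                  col3 = sample(0:3, 10, replace = TRUE),
                  col4 = sample(0:3, 10, replace = TRUE))

# Hypothetical helper: returns 1L if the chosen columns of one row
# sum to at least `threshold`, otherwise 0L.
row_flag <- function(x, cols = c("col3", "col4"), threshold = 4) {
  as.integer(sum(x[cols], na.rm = TRUE) >= threshold)
}

# apply() passes each row as a named numeric vector, so x[cols] selects by name.
var <- apply(dat, 1, row_flag)
```

Selecting by column name rather than by position (x[3:4]) should keep the result stable if columns are later reordered. Note that apply converts the data frame to a matrix first, so this pattern suits all-numeric data like the example here.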
CodePudding user response:
With dplyr, we can use mutate to create a new column (var) using rowSums and the condition of whether the sum of col3 and col4 is greater than or equal to 4. Here, I use + to convert from logical to 0 or 1. Then, we can use pull to get the vector for var.
library(tidyverse)
var <- dat %>%
  mutate(var = +(rowSums(select(., c(col3:col4)), na.rm = TRUE) >= 4)) %>%
  pull(var)
Output
[1] 1 1 1 0 0 1 1 1 0 0
Another option is to use sum with c_across for each row, again with + to convert the logical result to 0 or 1:
var <- dat %>%
  rowwise() %>%
  mutate(var = +(sum(c_across(col3:col4), na.rm = TRUE) >= 4)) %>%
  pull(var)
CodePudding user response:
If there are NAs as well, then use rowSums with na.rm = TRUE:
vec1 <- as.integer(rowSums(dat[3:4], na.rm = TRUE) >= 4)
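To see what na.rm = TRUE changes, here is a small sketch on a hand-made data frame. The data frame dat_na and the names vec_keep_na and vec_na are assumptions for illustration, not data from the question.

```r
# Illustrative data with missing values (not from the question).
dat_na <- data.frame(col3 = c(3, NA, 2, 0),
                     col4 = c(1, 3, NA, 0))

# Without na.rm, any row that contains an NA produces NA.
vec_keep_na <- as.integer(rowSums(dat_na[c("col3", "col4")]) >= 4)

# With na.rm = TRUE, missing values are dropped from each row's sum,
# so every row yields 0 or 1.
vec_na <- as.integer(rowSums(dat_na[c("col3", "col4")], na.rm = TRUE) >= 4)
```

Whether dropping NAs is the right call depends on the data: a row such as (NA, 3) is scored 0 here, even though the missing value might have pushed the sum to 4 or more.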