I'd like to create a variable to count the number of unique values in each row for a subset of columns (i.e.,baseline,wave1,wave2,wave3). So far I have the below. I have included an example data set with a variable "example" to show what I am after. I also have included the variable "change", which shows the variable created using the code below.
# Create example data
data <- structure(list(age = c("18", "19", NA, "40", "21", "33", "32",
"34", "43", "22"), baseline = c("1", "1", NA, "4", "1", "3",
"2", "4", "3", "2"), wave1 = c("1", "1", "2", "4", "4", "3",
"2", "4", "3", "2"), wave2 = c("1", "1", "4", "4", NA, "3",
"2", "4", "3", "2"), wave3 = c("1", "2", NA, "4", "4", "3",
"2", "4", "3", "4"), example = c("1", "2", "2", "1", "2", "1",
"1", "1", "1", "2"), change = c(6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L)), row.names = c(NA, -10L), groups = structure(list(.rows = structure(list(
1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, -10L), class = c("tbl_df",
"tbl", "data.frame")), class = c("rowwise_df", "tbl_df", "tbl",
"data.frame"))
library(dplyr)
# Create a var for change at any point (ignoring NAs)
data <- data %>%
rowwise() %>% #perform operation by row
mutate(change = length(unique(na.omit(baseline,wave1,wave2,wave3))))
CodePudding user response:
We can use n_distinct
where we can use the na.rm
argument to remove the NA
elements (though in the OP's data, it was "NA"
)
library(dplyr)
data %>%
type.convert(as.is = TRUE) %>%
rowwise %>%
mutate(change = n_distinct(c_across(baseline:wave3), na.rm = TRUE)) %>%
ungroup
-output
# A tibble: 10 × 7
age baseline wave1 wave2 wave3 example change
<int> <int> <int> <int> <int> <int> <int>
1 18 1 1 1 1 1 1
2 19 1 1 1 2 2 2
3 NA NA 2 4 NA 2 2
4 40 4 4 4 4 1 1
5 21 1 4 NA 4 2 2
6 33 3 3 3 3 1 1
7 32 2 2 2 2 1 1
8 34 4 4 4 4 1 1
9 43 3 3 3 3 1 1
10 22 2 2 2 4 2 2
Or a faster option with dapply
from collapse
library(collapse)
data$change <- dapply(slt(ungroup(data), baseline:wave3),
MARGIN = 1, FUN = fndistinct)