I am trying to create a function that dichotomizes selected columns of a data frame, using a different threshold for each column.
For example, in the following data frame with conditions A, B, C and D:
A <- c(0, 2, 1, 0, 2, 1, 0, 0, 1, 2)
B <- c(0, 1, 1, 1, 0, 0, 0, 1, 1, 0)
C <- c(0, 0, 0, 1, 1, 1, 1, 1, 1, 1)
D <- c(0, 0, 3, 1, 2, 1, 4, 0, 3, 0)
Data <- data.frame(A, B, C, D)
I would like the function to dichotomize the columns that I select [e.g. A, B, D] using thresholds that I assign [e.g. 2 for A, 1 for B, 3 for D], so that values at or above the threshold become 1 and all other values become 0.
I would like the dichotomized columns to be added to the data frame under new names [e.g. A_dicho, B_dicho, D_dicho].
The final data frame should look like this (you will notice B is already dichotomized, which is fine; it should be treated like the other columns and added):
   A B C D A_dicho B_dicho D_dicho
1  0 0 0 0       0       0       0
2  2 1 0 0       1       1       0
3  1 1 0 3       0       1       1
4  0 1 1 1       0       1       0
5  2 0 1 2       1       0       0
6  1 0 1 1       0       0       0
7  0 0 1 4       0       0       1
8  0 1 1 0       0       1       0
9  1 1 1 3       0       1       1
10 2 0 1 0       1       0       0
Could someone help me? Many thanks in advance.
CodePudding user response:
Make a small named vector of thresholds (one entry per column to dichotomize), then use Map to apply each threshold to its column:
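# Named vector: column name -> threshold (values >= threshold become 1)
thresh <- c("A"=2, "B"=1, "D"=3)
# The \(d, th) shorthand needs R >= 4.1; write function(d, th) on older versions
Data[paste(names(thresh), "dicho", sep="_")] <- Map(
  \(d, th) as.integer(d >= th), Data[names(thresh)], thresh
)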
Data
##    A B C D A_dicho B_dicho D_dicho
## 1  0 0 0 0       0       0       0
## 2  2 1 0 0       1       1       0
## 3  1 1 0 3       0       1       1
## 4  0 1 1 1       0       1       0
## 5  2 0 1 2       1       0       0
## 6  1 0 1 1       0       0       0
## 7  0 0 1 4       0       0       1
## 8  0 1 1 0       0       1       0
## 9  1 1 1 3       0       1       1
## 10 2 0 1 0       1       0       0
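Since the question asks for a function, the same idea could be wrapped up roughly as follows. This is only a sketch: the name dichotomize, its arguments, and the default "_dicho" suffix are made up for illustration and are not part of base R or any package. It assumes the thresholds are passed as a named numeric vector whose names match columns of the data frame.

```r
# Hypothetical helper; the name and arguments are illustrative only.
# data:       a data frame
# thresholds: named numeric vector, names = columns to dichotomize
# suffix:     appended to each column name to form the new column name
dichotomize <- function(data, thresholds, suffix = "_dicho") {
  cols <- names(thresholds)
  if (is.null(cols)) {
    stop("`thresholds` must be a named vector")
  }
  missing_cols <- setdiff(cols, names(data))
  if (length(missing_cols) > 0) {
    stop("Columns not found: ", paste(missing_cols, collapse = ", "))
  }
  # 1 where the value is at or above the column's threshold, 0 otherwise
  data[paste0(cols, suffix)] <- Map(
    function(x, th) as.integer(x >= th),
    data[cols], thresholds
  )
  data
}

# Example data from the question
Data <- data.frame(
  A = c(0, 2, 1, 0, 2, 1, 0, 0, 1, 2),
  B = c(0, 1, 1, 1, 0, 0, 0, 1, 1, 0),
  C = c(0, 0, 0, 1, 1, 1, 1, 1, 1, 1),
  D = c(0, 0, 3, 1, 2, 1, 4, 0, 3, 0)
)
result <- dichotomize(Data, c(A = 2, B = 1, D = 3))
```

The function returns a modified copy, so the original Data is left unchanged unless you assign the result back to it. It uses function(x, th) rather than the \(x, th) shorthand so that it also runs on R versions older than 4.1.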