str1<-c("A","B","C","D","E","F")
str2<-c("Apple", "Mango", "Avocado", "Watermelon", "Banana", "Pineapple")
str3<-c("Mouse","Cat", "Lion", "Shark", "Eagle", "Ladybug")
num1<-c(1:6)
num2<-c(2.3, 3.5, 4, 7, 6.2, 3)
binary1<-c(0,1,0,1,0,0)
binary2<-c(1,1,0,0,0,1)
mydata<-data.frame(str1,str2, str3,num1,num2, binary1, binary2)
It is often said that vectorization is better than looping, so I am wondering how to recode many variables with vectorized code instead of loops.
My first task is to convert str1, str2 and str3 to factors. I used:
for (i in c("str1","str2","str3")) {
  # [[i]] extracts the column as a vector; [i] would return a one-column data frame
  mydata[[i]] <- as.factor(mydata[[i]])
}
My second task is to convert binary1 and binary2 to factors and relabel their values as 0 = No, 1 = Yes. I used:
for (i in c("binary1","binary2")) {
  # [[i]] extracts the column as a vector; [i] would return a one-column data frame
  mydata[[i]] <- factor(mydata[[i]], levels=c(0,1), labels=c("No","Yes"))
}
How can I use vectorization instead of a loop in each case?
CodePudding user response:
For example, by using dplyr:
library(dplyr)
mydata %>%
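  mutate(across(1:3, as.factor),
         # factor() keeps the binary columns as factors; ifelse() alone would return character
         across(starts_with("bin"), ~factor(., levels = c(0, 1), labels = c("No", "Yes"))))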
  str1       str2    str3 num1 num2 binary1 binary2
1    A      Apple   Mouse    1  2.3      No     Yes
2    B      Mango     Cat    2  3.5     Yes     Yes
3    C    Avocado    Lion    3  4.0      No      No
4    D Watermelon   Shark    4  7.0     Yes      No
5    E     Banana   Eagle    5  6.2      No      No
6    F  Pineapple Ladybug    6  3.0      No     Yes
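For comparison, here is a minimal base R sketch of the same two recodings using lapply(), with no packages. Strictly speaking, lapply() is a functional that still iterates over the columns internally rather than a truly vectorized operation, but it removes the explicit for loop. The names str_cols and bin_cols are only illustrative, and the data frame is rebuilt here so the snippet runs on its own.

```r
# Rebuild the example data from the question (base R only).
mydata <- data.frame(
  str1 = c("A", "B", "C", "D", "E", "F"),
  str2 = c("Apple", "Mango", "Avocado", "Watermelon", "Banana", "Pineapple"),
  str3 = c("Mouse", "Cat", "Lion", "Shark", "Eagle", "Ladybug"),
  num1 = 1:6,
  num2 = c(2.3, 3.5, 4, 7, 6.2, 3),
  binary1 = c(0, 1, 0, 1, 0, 0),
  binary2 = c(1, 1, 0, 0, 0, 1)
)

# Illustrative helper vectors naming the columns to recode.
str_cols <- c("str1", "str2", "str3")
bin_cols <- c("binary1", "binary2")

# lapply() returns a list with one element per selected column;
# assigning it back with `[<-` replaces each of those columns.
mydata[str_cols] <- lapply(mydata[str_cols], as.factor)

# Extra arguments after the function name are passed on to factor().
mydata[bin_cols] <- lapply(mydata[bin_cols], factor,
                           levels = c(0, 1), labels = c("No", "Yes"))

# One class per column, to confirm the conversion.
vapply(mydata, class, character(1))
```

Selecting with mydata[str_cols] (single brackets, several names) keeps a data frame, which is what lets the list returned by lapply() be written back column by column.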
CodePudding user response:
You can use the map() function from purrr.
# Change str1, str2 and str3 into factors using the map() function
mydata[, c("str1", "str2", "str3")] <-
purrr::map(mydata[, c("str1", "str2", "str3")],
.f = as.factor)
str(mydata)
# Convert binary1 and binary2 to factors labelled 0 = No, 1 = Yes using the map() function
mydata[, c("binary1", "binary2")] <-
purrr::map(mydata[, c("binary1", "binary2")],
.f = factor, levels = c(0, 1), labels = c("No", "Yes"))
str(mydata)
# Output of the first str(mydata) call, before binary1 and binary2 are recoded:
'data.frame': 6 obs. of 7 variables:
$ str1 : Factor w/ 6 levels "A","B","C","D",..: 1 2 3 4 5 6
$ str2 : Factor w/ 6 levels "Apple","Avocado",..: 1 4 2 6 3 5
$ str3 : Factor w/ 6 levels "Cat","Eagle",..: 5 1 4 6 2 3
$ num1 : int 1 2 3 4 5 6
$ num2 : num 2.3 3.5 4 7 6.2 3
$ binary1: num 0 1 0 1 0 0
$ binary2: num 1 1 0 0 0 1
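A possible variant, sketched here on the assumption that purrr is installed: modify_at() applies a function only to the named columns and returns an object of the same type as its input, so the data frame does not need to be subset and reassigned. The data frame is rebuilt so the snippet runs on its own.

```r
library(purrr)

# Rebuild the example data from the question.
mydata <- data.frame(
  str1 = c("A", "B", "C", "D", "E", "F"),
  str2 = c("Apple", "Mango", "Avocado", "Watermelon", "Banana", "Pineapple"),
  str3 = c("Mouse", "Cat", "Lion", "Shark", "Eagle", "Ladybug"),
  num1 = 1:6,
  num2 = c(2.3, 3.5, 4, 7, 6.2, 3),
  binary1 = c(0, 1, 0, 1, 0, 0),
  binary2 = c(1, 1, 0, 0, 0, 1)
)

# Convert only the three string columns; all other columns are left as they are.
mydata <- modify_at(mydata, c("str1", "str2", "str3"), as.factor)

# Convert the two binary columns to labelled factors.
mydata <- modify_at(mydata, c("binary1", "binary2"),
                    ~ factor(.x, levels = c(0, 1), labels = c("No", "Yes")))

# After both steps, the five recoded columns should be reported as factors.
str(mydata)
```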
CodePudding user response:
Please find below an alternative solution using data.table:
- Code
library(data.table)
sel_cols1 <- c("str1", "str2", "str3")
sel_cols2 <- c("binary1", "binary2")
setDT(mydata)[, (sel_cols1) := lapply(.SD, as.factor), .SDcols = sel_cols1
][, (sel_cols2) := lapply(.SD, function(x) as.factor(fifelse(x == 0, "No", "Yes"))), .SDcols = sel_cols2][]
- Output
#>    str1       str2    str3 num1 num2 binary1 binary2
#> 1:    A      Apple   Mouse    1  2.3      No     Yes
#> 2:    B      Mango     Cat    2  3.5     Yes     Yes
#> 3:    C    Avocado    Lion    3  4.0      No      No
#> 4:    D Watermelon   Shark    4  7.0     Yes      No
#> 5:    E     Banana   Eagle    5  6.2      No      No
#> 6:    F  Pineapple Ladybug    6  3.0      No     Yes
- Check of the variable classes
sapply(mydata,class)
#> str1 str2 str3 num1 num2 binary1 binary2
#> "factor" "factor" "factor" "integer" "numeric" "factor" "factor"
Created on 2021-11-16 by the reprex package (v2.0.1)
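One point worth keeping in mind: setDT() converts mydata to a data.table by reference, so the original data.frame is modified in place. If the original should stay untouched, a sketch along these lines may help. It assumes data.table is installed, uses as.data.table() to work on a copy, and the name dt is only illustrative. stringsAsFactors = FALSE is set explicitly so the strings start out as character on older R versions too.

```r
library(data.table)

# Rebuild the example data from the question.
mydata <- data.frame(
  str1 = c("A", "B", "C", "D", "E", "F"),
  str2 = c("Apple", "Mango", "Avocado", "Watermelon", "Banana", "Pineapple"),
  str3 = c("Mouse", "Cat", "Lion", "Shark", "Eagle", "Ladybug"),
  num1 = 1:6,
  num2 = c(2.3, 3.5, 4, 7, 6.2, 3),
  binary1 = c(0, 1, 0, 1, 0, 0),
  binary2 = c(1, 1, 0, 0, 0, 1),
  stringsAsFactors = FALSE
)

sel_cols1 <- c("str1", "str2", "str3")
sel_cols2 <- c("binary1", "binary2")

# as.data.table() returns a copy, so mydata itself stays a plain data.frame.
dt <- as.data.table(mydata)

# := updates the selected columns of dt in place.
dt[, (sel_cols1) := lapply(.SD, as.factor), .SDcols = sel_cols1]
dt[, (sel_cols2) := lapply(.SD, factor, levels = c(0, 1), labels = c("No", "Yes")),
   .SDcols = sel_cols2]

sapply(dt, class)      # classes in the recoded copy
sapply(mydata, class)  # classes in the original, which is unchanged
```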