Create variables in a for loop using character data

Time:03-15

I have a data frame with a column ("name") that contains names of fruits:

 name
 Apple
 Apple
 Mango
 Banana
 Banana
 Orange
 Mango
 Orange

... and so on. I have 9 fruits in my data.

I want to create new variables following the naming rule "name_'data'". So, I want to add 9 more variables such that:

 name     name_Apple      name_Mango     name_Banana    name_Orange
 Apple       1                0              0              0
 Apple       1                0              0              0
 Mango       0                1              0              0
 Banana      0                0              1              0
 Banana      0                0              1              0
 Orange      0                0              0              1
 Mango       0                1              0              0
 Orange      0                0              0              1

I want to use a for loop to do this because more data will be added to the existing data frame. I have tried this:

name_list <- c("Apple", "Mango", "Banana", "Orange")
for (i in name_list) {
  df_main$name_[[i]] <- ifelse(df_main$name == [[i]], 1, 0)
  
}

I get the error "Error: unexpected '[['". I think I'm referencing character data wrong in the loop, but can't figure out how to do it correctly. Will mutate() work better here?
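
For reference, the parse error comes from `== [[i]]`: the `[[` operator needs an object on its left, and `$` only accepts a literal column name, so `df_main$name_[[i]]` cannot build a name from `i`. A minimal sketch of a corrected loop follows, assuming the data frame is called df_main as in the question; the sample data and the use of unique() in place of a hand-typed name_list are illustrative choices, not part of the original attempt.

```r
# Sample data matching the question; df_main is the name used there.
df_main <- data.frame(
  name = c("Apple", "Apple", "Mango", "Banana",
           "Banana", "Orange", "Mango", "Orange")
)

# Assumption: take the values from the data instead of typing them,
# so fruits added later are picked up automatically.
name_list <- unique(df_main$name)

for (i in name_list) {
  # Build the column name as a string and assign it with [[ ]].
  # Compare against i directly; it already holds the fruit name.
  df_main[[paste0("name_", i)]] <- ifelse(df_main$name == i, 1, 0)
}

df_main
```

Because the column name is assembled with paste0(), the same loop should keep working when new fruit names appear in the name column.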

CodePudding user response:

We can use dummy_cols() from the fastDummies package:

library(fastDummies)
library(magrittr)  # provides the %>% pipe used below
df1 %>%
    dummy_cols(select_columns = 'name')

-output

    name name_Apple name_Banana name_Mango name_Orange
1  Apple          1           0          0           0
2  Apple          1           0          0           0
3  Mango          0           0          1           0
4 Banana          0           1          0           0
5 Banana          0           1          0           0
6 Orange          0           0          0           1
7  Mango          0           0          1           0
8 Orange          0           0          0           1

data

df1 <- structure(list(name = c("Apple", "Apple", "Mango", "Banana", 
"Banana", "Orange", "Mango", "Orange")), class = "data.frame", row.names = c(NA, 
-8L))

CodePudding user response:

In base R, you can do:

mat <- outer(df$name, unique(df$name), function(a, b) as.numeric(a == b))
cbind(df, setNames(as.data.frame(mat), paste0('name_', unique(df$name))))

#>     name name_Apple name_Mango name_Banana name_Orange
#> 1  Apple          1          0           0           0
#> 2  Apple          1          0           0           0
#> 3  Mango          0          1           0           0
#> 4 Banana          0          0           1           0
#> 5 Banana          0          0           1           0
#> 6 Orange          0          0           0           1
#> 7  Mango          0          1           0           0
#> 8 Orange          0          0           0           1 

CodePudding user response:

Another way:

model.matrix(~ name - 1, data = df)

#     nameApple nameBanana nameMango nameOrange
# 1         1          0         0          0
# 2         1          0         0          0
# 3         0          0         1          0
# 4         0          1         0          0
# 5         0          1         0          0
# 6         0          0         0          1
# 7         0          0         1          0
# 8         0          0         0          1

data:

structure(list(name = c("Apple", "Apple", "Mango", "Banana", 
"Banana", "Orange", "Mango", "Orange")), class = "data.frame", row.names = c(NA, 
-8L)) -> df
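
Note that model.matrix() names its columns nameApple, nameBanana, and so on, without the underscore the question asks for, and it returns a matrix rather than adding columns to the data frame. A short sketch of one way to close that gap, assuming the same df as above; the intermediate names mm and result are made up for illustration.

```r
df <- data.frame(
  name = c("Apple", "Apple", "Mango", "Banana",
           "Banana", "Orange", "Mango", "Orange")
)

# One indicator column per level; "- 1" drops the intercept so that
# every fruit gets its own column.
mm <- model.matrix(~ name - 1, data = df)

# Insert the underscore after the leading "name" prefix only.
colnames(mm) <- sub("^name", "name_", colnames(mm))

# Attach the indicator columns to the original data frame.
result <- cbind(df, as.data.frame(mm))
result
```

The indicator columns come out in factor-level order (alphabetical here) rather than in order of first appearance.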