How to create a function that changes column values by name, conditional on another variable, in R?

Time:08-03

I am trying to write code that uses the value of one variable to mutate only some columns. Here is a reproducible example.

Name <- c("Jon", "Bill", "Maria", "Ben", "Tina")
Age <- c(7, 12, 19, 18, 30)
Age_Kid_17 <- NA
Age_Kid_18 <- NA
Age_Kid_20 <- NA


df <- data.frame(Name, Age,Age_Kid_17,Age_Kid_18,Age_Kid_20)

I want to change the values of the columns based on the value of the column 'Age', using that value to determine which columns to change. The following loop works, but it takes too long on the data I am working with.

for(i in 1:nrow(df)){
  age_ <- df[i,'Age']

  if(age_>21){
    next
  }
  if(age_<17){
    for (a in 17:20){
      df[i,paste0('Age_Kid_',a)] <- 0
    }
  }else{
    for (a in age_:20){
      df[i,paste0('Age_Kid_',a)] <- 0
    }
  }
}

CodePudding user response:

Tom's answer is a good option. You could also do it without pivoting, as follows, though the resulting code is a little tricky to follow. Also, please note that your loop creates a column called "Age_Kid_19" that is not present in the input data. Is that intended?

library(tidyverse)

df %>% 
  rowwise() %>% 
  mutate(across(starts_with('Age_Kid'), ~ifelse(Age > parse_number(cur_column()), NA, 0)))

  Name    Age Age_Kid_17 Age_Kid_18 Age_Kid_20
  <chr> <dbl>      <dbl>      <dbl>      <dbl>
1 Jon       7          0          0          0
2 Bill     12          0          0          0
3 Maria    19         NA         NA          0
4 Ben      18         NA          0          0
5 Tina     30         NA         NA         NA
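
If speed is the main concern, a base R version that loops over the few Age_Kid columns rather than over the rows may also be worth trying. This is only a sketch: the function name `fill_kid_columns` and its `prefix` argument are made up for illustration. It assumes a cell should become 0 when Age is at or below the number in the column name and stay NA otherwise, which matches the output above. Unlike the loop in the question, it does not create any new columns.

```r
# Hypothetical helper (the name and the `prefix` argument are illustrative only).
fill_kid_columns <- function(data, prefix = "Age_Kid_") {
  # Columns whose names start with the prefix, e.g. "Age_Kid_17".
  kid_cols <- names(data)[startsWith(names(data), prefix)]
  # The number encoded in each column name, e.g. 17.
  thresholds <- as.numeric(substring(kid_cols, nchar(prefix) + 1))

  # One vectorized comparison per column instead of one iteration per row.
  for (j in seq_along(kid_cols)) {
    data[[kid_cols[j]]] <- ifelse(data$Age <= thresholds[j], 0, NA_real_)
  }
  data
}

# Example data from the question.
df <- data.frame(
  Name = c("Jon", "Bill", "Maria", "Ben", "Tina"),
  Age = c(7, 12, 19, 18, 30),
  Age_Kid_17 = NA, Age_Kid_18 = NA, Age_Kid_20 = NA
)

result <- fill_kid_columns(df)
```

Because the comparison runs on whole columns, the amount of R-level looping depends on the number of Age_Kid columns rather than the number of rows.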

CodePudding user response:

Is this what you're looking for?

library(tidyverse) 

df %>% 
  pivot_longer(-c(Name, Age)) %>% 
  mutate(value = case_when(Age < parse_number(name) ~ 0, 
                           TRUE ~ NA_real_)) %>% 
  pivot_wider(names_from = name, 
              values_from = value)

# A tibble: 5 x 5
  Name    Age Age_Kid_17 Age_Kid_18 Age_Kid_20
  <chr> <dbl>      <dbl>      <dbl>      <dbl>
1 Jon       7          0          0          0
2 Bill     12          0          0          0
3 Maria    19         NA         NA          0
4 Ben      18         NA         NA          0
5 Tina     30         NA         NA         NA
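
One cell differs from the row-wise answer above: for Ben (Age 18), Age_Kid_18 is NA here but 0 there, because this code compares with `<` while the other one effectively uses `<=`. The loop in the question also writes 0 when the age equals the column number, so `<=` may be the comparison you want if the loop's behavior is the target. A tiny illustration of the difference, with made-up variable names:

```r
age <- 18        # Ben's age in the example data
threshold <- 18  # the number in the column name Age_Kid_18

strict <- ifelse(age < threshold, 0, NA_real_)      # NA, because 18 < 18 is FALSE
inclusive <- ifelse(age <= threshold, 0, NA_real_)  # 0, because 18 <= 18 is TRUE
```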