Create a column conditioned on two possible levels contained in another column

Time:11-30

I have the following dataset

out 
# A tibble: 1,356 x 7
      ID GROUP    Gender   Age Education tests      score
   <dbl> <chr>     <dbl> <dbl>     <dbl> <chr>      <dbl>
 1     1 TRAINING      1    74        18 ADAS_CogT0  14.7
 2     1 TRAINING      1    74        18 ROCF_CT0    32  
 3     1 TRAINING      1    74        18 ROCF_IT0     3.7
 4     1 TRAINING      1    74        18 ROCF_RT0     3.9
 5     1 TRAINING      1    74        18 PVF_T0      41.3
 6     1 TRAINING      1    74        18 SVF_T0      40  
 7     1 TRAINING      1    74        18 ADAS_CogT7  16  
 8     1 TRAINING      1    74        18 ROCF_CT7    33  
 9     1 TRAINING      1    74        18 ROCF_IT7     1.7
10     1 TRAINING      1    74        18 ROCF_RT7     2.4

I would like to create a new column that contains the value score0 for tests ending in T0 and the value score7 for tests ending in T7. What are the possible ways to do this?

CodePudding user response:

Please include the data in your posts, for example by pasting the output of dput(df).

You could use a combination of case_when() and str_detect():

library(dplyr)
library(stringr) 

# Small reproducible sample; note the test column is named `test` here, not `tests`
df <- structure(
  list(
    ID = 1:10,
    GROUP = rep('TRAINING', 10),
    Gender = rep(1, 10),
    Education = rep(74, 10),
    test = c(
      'ADAS_CogT0',
      'ROCF_CT0',
      'ROCF_IT0',
      'ROCF_RT0',
      'PVF_T0',
      'SVF_T0',
      'ADAS_CogT7',
      'ROCF_CT7',
      'ROCF_IT7',
      'ROCF_RT7'
    ),
    score = c(14.7, 32, 3.7, 3.9, 41.3, 40, 16, 33, 1.7, 2.4)
  ),
  row.names = 1:10,
  class = "data.frame"
)

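# Label each row by the suffix of the test name; '$' anchors the match to the end
df2 <- df %>%
  mutate(new = case_when(str_detect(test, 'T0$') ~ 'score0',
                         str_detect(test, 'T7$') ~ 'score7',
                         TRUE ~ test))  # fallback: keep the original test name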

   ID    GROUP Gender Education       test score    new
1   1 TRAINING      1        74 ADAS_CogT0  14.7 score0
2   2 TRAINING      1        74   ROCF_CT0  32.0 score0
3   3 TRAINING      1        74   ROCF_IT0   3.7 score0
4   4 TRAINING      1        74   ROCF_RT0   3.9 score0
5   5 TRAINING      1        74     PVF_T0  41.3 score0
6   6 TRAINING      1        74     SVF_T0  40.0 score0
7   7 TRAINING      1        74 ADAS_CogT7  16.0 score7
8   8 TRAINING      1        74   ROCF_CT7  33.0 score7
9   9 TRAINING      1        74   ROCF_IT7   1.7 score7
10 10 TRAINING      1        74   ROCF_RT7   2.4 score7
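
Since the question asks about tests ending with T0 or T7, it may help to see how an anchored pattern differs from an unanchored one. The sketch below uses base R's grepl(), which treats these patterns the same way str_detect() does, so it needs no extra packages. The name "T0_notes_T7" is invented for illustration and does not appear in the original data.

```r
# Hypothetical test names; "T0_notes_T7" is made up to show the difference
tests <- c("ROCF_CT0", "ROCF_CT7", "T0_notes_T7")

# Unanchored: matches "T0" anywhere in the string
unanchored <- grepl("T0", tests)

# Anchored with "$": matches only when the string ends in "T0"
anchored <- grepl("T0$", tests)

print(data.frame(tests, unanchored, anchored))
```

With the sample names in the question both patterns give the same result; the anchor only matters if a test name could contain T0 or T7 somewhere other than at the end.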

CodePudding user response:

Do you want the output to be the strings 'score0' and 'score7'?

You may try -

library(dplyr)

out %>%
  mutate(result = case_when(grepl('T0$', tests) ~ 'score0',
                            grepl('T7$', tests) ~ 'score7'))

#   ID    GROUP Gender Age Education      tests score result
#1   1 TRAINING      1  74        18 ADAS_CogT0  14.7 score0
#2   1 TRAINING      1  74        18   ROCF_CT0  32.0 score0
#3   1 TRAINING      1  74        18   ROCF_IT0   3.7 score0
#4   1 TRAINING      1  74        18   ROCF_RT0   3.9 score0
#5   1 TRAINING      1  74        18     PVF_T0  41.3 score0
#6   1 TRAINING      1  74        18     SVF_T0  40.0 score0
#7   1 TRAINING      1  74        18 ADAS_CogT7  16.0 score7
#8   1 TRAINING      1  74        18   ROCF_CT7  33.0 score7
#9   1 TRAINING      1  74        18   ROCF_IT7   1.7 score7
#10  1 TRAINING      1  74        18   ROCF_RT7   2.4 score7
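
One thing to keep in mind with this form: when no condition matches and there is no fallback branch, case_when() returns NA for that row. A minimal sketch, assuming a hypothetical test name "MMSE" that has neither suffix:

```r
library(dplyr)

# "MMSE" is an invented name with no T0/T7 suffix
tests <- c("PVF_T0", "SVF_T7", "MMSE")

# No fallback branch, so unmatched entries become NA
result <- case_when(grepl("T0$", tests) ~ "score0",
                    grepl("T7$", tests) ~ "score7")

print(result)
```

If NA is not what you want for such rows, add a final `TRUE ~ some_value` branch.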

Another option is readr::parse_number, which extracts the first number found in each string:

out %>%
  mutate(result = paste0('score', readr::parse_number(tests)))
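
If you prefer to avoid the extra package, a similar result can probably be had in base R with sub(). This is only a sketch under the assumption that every test name ends in T followed by digits; names that do not match the pattern are returned unchanged by sub().

```r
# A few test names taken from the question
tests <- c("ADAS_CogT0", "ROCF_CT7", "PVF_T0")

# Capture the digits after the final "T" and prefix them with "score"
result <- sub(".*T([0-9]+)$", "score\\1", tests)

print(result)
```

Unlike parse_number, which takes the first number in the string, this pattern looks only at the digits at the end, which may matter if a test name ever contains digits earlier on.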