Using names of sorted variables within mutate and ifelse

Time:10-27

I have the following example data:

id <- c(1, 2, 3)
ex3 <- c(0.8,   0.2, 0.3)
ex2 <- c(0.1,   0.4, 0.04)
ex1 <- c(0.04,  0.3, 0.5)
ex <- c(1, 1, 1)
ran <- c(0.5, 0.7, 0.6)
dat <- data.frame(id, ex1, ex2, ex3, ex, ran)

dat

  id  ex1  ex2 ex3 ex ran
1  1 0.04 0.10 0.8  1 0.5
2  2 0.30 0.40 0.2  1 0.7
3  3 0.50 0.04 0.3  1 0.6

I want to modify the variable "ex" using the following code with dplyr/tidyr:

library(dplyr)
library(tidyr)

dat %>% 
  pivot_longer(
    cols = ex1:ex3
  ) %>% 
  arrange(id, desc(value)) %>% 
  group_by(id) %>% 
  mutate(ex = ifelse(ran <= value[1] & ran > sum(value[2], value[3]), 5, ex)) %>% 
  pivot_wider(
    names_from=name
  )

# A tibble: 3 x 6
# Groups:   id [3]
     id    ex   ran   ex3   ex2   ex1
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1     5   0.5   0.8  0.1   0.04
2     2     1   0.7   0.2  0.4   0.3 
3     3     1   0.6   0.3  0.04  0.5

Is it possible to use the names of "ex1"-"ex3" as new values for "ex" instead of "5" within the ifelse-statement in mutate? Example: Using the names of the ex$-variables as new values leads to this output:

  id ex3  ex2  ex1  ex ran
1  1 0.8 0.10 0.04 ex3 0.5
2  2 0.2 0.40 0.30   1 0.7
3  3 0.3 0.04 0.50   1 0.6

Or using the number of the ex$-variables leads to this output:

  id ex3  ex2  ex1  ex ran
1  1 0.8 0.10 0.04   3 0.5
2  2 0.2 0.40 0.30   1 0.7
3  3 0.3 0.04 0.50   1 0.6

Or, if I want the number of the variable with the lowest value as the new value for "ex" (here 1, because the lowest value is in "ex1"):

  id ex3  ex2  ex1  ex ran
1  1 0.8 0.10 0.04   1 0.5
2  2 0.2 0.40 0.30   1 0.7
3  3 0.3 0.04 0.50   1 0.6

To sum it up: I want to refer to the variable-names of the sorted "ex$"-values to create new values for "ex" within ifelse in mutate.

Answer:

One way is to use parse_number() from the readr package, which extracts the number from the names ex1, ex2 and ex3. Depending on your logic, you can do:

parse_number(name[1])

Here 1 is the position within each sorted group (the name of the largest value); use 2 or 3 instead, depending on what best fits your logic.
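
As a quick illustration of what parse_number() does with these names, here is a minimal sketch; the vector nms is a made-up stand-in for the name column produced by pivot_longer():

```r
library(readr)

# Hypothetical stand-in for the `name` column of one sorted group
nms <- c("ex3", "ex2", "ex1")

# parse_number() drops the non-numeric prefix and returns a double vector
parse_number(nms)
#> [1] 3 2 1

# Indexing first picks one position of the sorted group
parse_number(nms[1])
#> [1] 3
```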

library(dplyr)
library(tidyr)
library(readr)

dat %>% 
  pivot_longer(
    cols = ex1:ex3
  ) %>% 
  arrange(id, desc(value)) %>% 
  group_by(id) %>% 
  mutate(ex = ifelse(ran <= value[1] & ran > sum(value[2], value[3]), parse_number(name[1]), ex)) %>% 
  pivot_wider(
    names_from=name
  )

     id    ex   ran   ex3   ex2   ex1
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1     3   0.5   0.8  0.1   0.04
2     2     1   0.7   0.2  0.4   0.3 
3     3     1   0.6   0.3  0.04  0.5 
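
For the lowest-value variant from the question, a minimal sketch follows. It assumes the rows of each group are sorted in descending order, so the variable with the smallest value is the last row; name[n()] picks it regardless of how many ex columns there are. The object name res is only for illustration.

```r
library(dplyr)
library(tidyr)
library(readr)

# Example data from the question
dat <- data.frame(
  id  = c(1, 2, 3),
  ex1 = c(0.04, 0.3, 0.5),
  ex2 = c(0.1, 0.4, 0.04),
  ex3 = c(0.8, 0.2, 0.3),
  ex  = c(1, 1, 1),
  ran = c(0.5, 0.7, 0.6)
)

res <- dat %>%
  pivot_longer(cols = ex1:ex3) %>%
  arrange(id, desc(value)) %>%
  group_by(id) %>%
  # name[n()] is the last name in the sorted group, i.e. the lowest value
  mutate(ex = ifelse(ran <= value[1] & ran > sum(value[2], value[3]),
                     parse_number(name[n()]), ex)) %>%
  ungroup() %>%
  pivot_wider(names_from = name)

res$ex
#> [1] 1 1 1
```

For id 1 the condition is true and the lowest value sits in ex1, so the new value is 1, which happens to equal the original value of ex.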

For the full name, convert ex to character so that both branches of ifelse have the same type:

library(dplyr)
library(tidyr)
library(readr)

dat %>% 
  pivot_longer(
    cols = ex1:ex3
  ) %>% 
  arrange(id, desc(value)) %>% 
  group_by(id) %>% 
  mutate(ex = ifelse(ran <= value[1] & ran > sum(value[2], value[3]), name[1], as.character(ex))) %>% 
  pivot_wider(
    names_from=name
  )

     id ex      ran   ex3   ex2   ex1
  <dbl> <chr> <dbl> <dbl> <dbl> <dbl>
1     1 ex3     0.5   0.8  0.1   0.04
2     2 1       0.7   0.2  0.4   0.3 
3     3 1       0.6   0.3  0.04  0.5 
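
The as.character(ex) in the call above matters because base ifelse() takes its result type from whichever branches are actually used. A small sketch in base R, with made-up test vectors, shows the effect; in a grouped mutate, groups returning different types like this are likely to make dplyr complain about incompatible results across groups.

```r
# All TRUE: only the "yes" branch is used, so the result is character
all_true <- ifelse(c(TRUE, TRUE, TRUE), "ex3", c(1, 1, 1))
class(all_true)
#> [1] "character"

# All FALSE: only the "no" branch is used, so the result stays numeric
all_false <- ifelse(c(FALSE, FALSE, FALSE), "ex2", c(1, 1, 1))
class(all_false)
#> [1] "numeric"

# Converting the "no" branch up front gives character in both cases
all_false_chr <- ifelse(c(FALSE, FALSE, FALSE), "ex2", as.character(c(1, 1, 1)))
class(all_false_chr)
#> [1] "character"
```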