R, Tidyverse: Replace each factor in a data.frame with a randomly drawn value from a corresponding d

Time:12-31

I am trying to find a tidyverse solution to the following mapping problem, without success. I begin with a data frame where every variable is a factor.

A B C
1 1 1
2 2 1
3 2 1

Each factor value corresponds to a distribution of random values, like so. I am trying to replace each factor value with a value drawn at random from its corresponding distribution.

one<-rnorm(5)

one

[1]  0.8257975  1.0291827 -0.5708449  0.1112144 -0.2817895

two<-rnorm(3)

two

[1] -2.06849794 -0.78663065  0.02430413

three<-rnorm(1)

three

[1] 0.1309044

This would be an example output, after the mapping takes place. Every factor value has been replaced by a value at random from the corresponding distribution.

      A          B           C
  0.8257975  1.0291827  -0.5708449
-2.06849794 -0.78663065  0.1112144
  0.1309044  0.02430413 -0.2817895

CodePudding user response:

One solution would be to substitute the factor values with case_when(). Here is an example:

Data

library(dplyr)

data <-
  tibble(
    A = c(1,2,3),
    B = c(1,2,2),
    C = c(1,1,1)
  ) %>% 
  mutate(across(everything(), as.factor))

Code

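# Replace each factor value with a draw from the distribution for its level.
to_dist <- function(x){
  
  n <- length(x)
  
  # case_when() evaluates every right-hand side for all n elements,
  # then keeps, per element, the draw whose condition is TRUE.
  case_when(
    x == "1" ~ rnorm(n,mean = 10),
    x == "2" ~ rnorm(n,mean = 100,sd = 10),
    x == "3" ~ rnorm(n,mean = 1),
    TRUE ~ NA_real_  # any other level becomes NA
  )
}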

data %>% 
  mutate(across(everything(), to_dist))

Output

# A tibble: 3 x 3
       A     B     C
   <dbl> <dbl> <dbl>
1   8.61  10.3 10.9 
2 104.    90.3  9.71
3   1.89 105.   9.26
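
If the values should instead come from vectors that already exist (like one, two and three in the question), a minimal sketch could look like the following. The names pools and draw_from_pool are hypothetical, the pool sizes simply mirror the question, and the sketch assumes each cell is drawn independently, so the same value may be picked more than once.

```r
library(dplyr)

# Pools of pre-drawn values, keyed by factor level. The list name `pools`
# and the seed are assumptions made for this illustration.
set.seed(42)
pools <- list(
  "1" = rnorm(5),
  "2" = rnorm(3),
  "3" = rnorm(1)
)

data <- tibble(
  A = c(1, 2, 3),
  B = c(1, 2, 2),
  C = c(1, 1, 1)
) %>%
  mutate(across(everything(), as.factor))

# Hypothetical helper: for each element of x, pick one value at random
# from the pool whose name matches that element's level.
draw_from_pool <- function(x, pools) {
  vapply(
    as.character(x),
    function(level) {
      pool <- pools[[level]]
      # Index via sample.int() so a length-one pool is handled correctly;
      # sample(pool, 1) can treat a single number as the upper bound of 1:n.
      pool[sample.int(length(pool), 1)]
    },
    numeric(1),
    USE.NAMES = FALSE
  )
}

result <- data %>%
  mutate(across(everything(), ~ draw_from_pool(.x, pools)))

result
```

Every cell of result is then one of the values stored in the pool for the original level. If each stored value may be used only once, the pools would have to be sampled without replacement across the whole data frame, which this sketch does not do.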