Suppose I have a tibble in which each row is a set of probabilities that add up to 1. For example,
probs <- tibble(A = c(0.1, 0.5, 0.6),
                B = c(0.5, 0.2, 0.1),
                C = c(0.4, 0.3, 0.3))
probs
# A tibble: 3 x 3
      A     B     C
  <dbl> <dbl> <dbl>
1   0.1   0.5   0.4
2   0.5   0.2   0.3
3   0.6   0.1   0.3
I'd like to create a column with mutate that randomly selects a letter based on the probabilities in each row. This would be my best guess:
probs %>%
  mutate(random_outcome = sample(LETTERS[1:3], size = 1, prob = c(A, B, C)))
But this generates an error:
Error: Problem with `mutate()` column `random_outcome`.
i `random_outcome = sample(LETTERS[1:3], size = 1, prob = c(A, B, C))`.
x incorrect number of probabilities
Run `rlang::last_error()` to see where the error occurred.
CodePudding user response:
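The error occurs because mutate() passes whole columns to the expression, so c(A, B, C) is a vector of 9 values while sample() expects exactly 3 probabilities. Use rowwise() in combination with c_across() so that the expression is evaluated one row at a time: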
library(tidyverse)
set.seed(1)
probs %>%
  rowwise() %>%
  mutate(random_outcome = sample(LETTERS[1:3], size = 1, prob = c_across(c(A, B, C)))) %>%
  ungroup()
# A tibble: 3 x 4
      A     B     C random_outcome
  <dbl> <dbl> <dbl> <chr>
1   0.1   0.5   0.4 B
2   0.5   0.2   0.3 A
3   0.6   0.1   0.3 A
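As a possible alternative to rowwise(), the same row-by-row sampling can be sketched with purrr::pmap_chr(), which walks the three columns in parallel and calls sample() once per row. This is not part of the answer above, only an illustration; it assumes purrr is installed, and the argument names p_a, p_b and p_c are made up for the example.

```r
library(dplyr)
library(purrr)

# Same example data as in the question.
probs <- tibble(A = c(0.1, 0.5, 0.6),
                B = c(0.5, 0.2, 0.1),
                C = c(0.4, 0.3, 0.3))

set.seed(1)

# pmap_chr() iterates over A, B and C in parallel, so each call to
# sample() receives the three probabilities of a single row.
# p_a, p_b, p_c are hypothetical argument names chosen for this sketch.
result <- probs %>%
  mutate(random_outcome = pmap_chr(
    list(A, B, C),
    function(p_a, p_b, p_c) {
      sample(LETTERS[1:3], size = 1, prob = c(p_a, p_b, p_c))
    }
  ))

result
```

Because no grouping is added to the tibble, there is no need to call ungroup() afterwards. The draws may differ from the rowwise() version even with the same seed, so the output is not shown here.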