I would like to add an extra column, z, based on the following conditions:
- if x == "A", generate a binary variable assuming the probability of success (= 1) is 0.5
- if x == "C" & y == "N", generate a binary variable assuming the probability of success is 0.25.
# Sample data
library(tibble)
df <- tibble(
  x = c("A", "C", "C", "B", "C", "A", "A"),
  y = c("Y", "N", "Y", "N", "N", "N", "Y"))
Currently, my approach uses filter, then set.seed and rbinom, and finally rbind. But I am looking for a more elegant solution that doesn't involve subsetting and re-joining the data.
CodePudding user response:
You may put your logic into a simple if / else structure and wrap it in a function g().
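# g() receives one row of df as a named character vector (via apply)
# Note: the \(z) lambda shorthand requires R >= 4.1
g <- \(z) {
  if (z['x'] == 'A') {
    rbinom(1, 1, .5)    # one Bernoulli draw with p = 0.5
  } else if (z['x'] == 'C' && z['y'] == 'N') {
    rbinom(1, 1, .25)   # one Bernoulli draw with p = 0.25
  } else {
    NA                  # no condition matched
  }
}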
set.seed(42)
transform(df, z=apply(df, 1, g))
# x y z
# 1 A Y 1
# 2 C N 1
# 3 C Y NA
# 4 B N NA
# 5 C N 0
# 6 A N 1
# 7 A Y 1
CodePudding user response:
This is a good case for dplyr::case_when, since you are using tidyverse functions.
library(dplyr)
set.seed(1)
df %>%
  mutate(z = case_when(x == "A" ~ rbinom(n(), 1, 0.5),
                       x == "C" & y == "N" ~ rbinom(n(), 1, 0.25)))
# A tibble: 7 x 3
# Rowwise:
x y z
<chr> <chr> <int>
1 A Y 0
2 C N 1
3 C Y NA
4 B N NA
5 C N 0
6 A N 0
7 A Y 1
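One detail worth keeping in mind: case_when evaluates every right-hand side for the full column and only then picks a value per row, so each rbinom(n(), ...) call draws for all rows, not just the matching ones. The sketch below makes that explicit by storing the two draws in helper columns first. The names z_a and z_c are hypothetical and only there for illustration; the sample data is repeated so the snippet runs on its own.

```r
library(dplyr)

# Sample data from the question
df <- tibble(
  x = c("A", "C", "C", "B", "C", "A", "A"),
  y = c("Y", "N", "Y", "N", "N", "N", "Y"))

set.seed(1)
out <- df %>%
  mutate(
    z_a = rbinom(n(), 1, 0.5),    # hypothetical helper: a draw for every row, p = 0.5
    z_c = rbinom(n(), 1, 0.25),   # hypothetical helper: a draw for every row, p = 0.25
    # pick the matching draw per row; rows matching no condition become NA
    z = case_when(x == "A" ~ z_a,
                  x == "C" & y == "N" ~ z_c))
```

This does not change the distribution of z, but it does mean that the draws actually kept depend on row position, so results for a given seed will shift if rows are added or reordered.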
CodePudding user response:
You can try a nested ifelse, like below:
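# rbinom is vectorized over prob; an NA probability yields NA with a warning,
# which suppressWarnings() silences
transform(
  df,
  z = suppressWarnings(
    rbinom(
      nrow(df), 1,
      ifelse(x == "A", 0.5,
        ifelse(x == "C" & y == "N", 0.25, NA)
      )
    )
  )
)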
which gives
x y z
1 A Y 1
2 C N 0
3 C Y NA
4 B N NA
5 C N 1
6 A N 1
7 A Y 1
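If you would rather not rely on suppressWarnings, a possible variant of the same idea is to build the probability vector first and draw only for the rows where it is defined. This is a minimal sketch in base R; the names p and ok are illustrative, and the sample data is repeated as a plain data.frame so the snippet is self-contained.

```r
# Sample data from the question
df <- data.frame(
  x = c("A", "C", "C", "B", "C", "A", "A"),
  y = c("Y", "N", "Y", "N", "N", "N", "Y"))

# Per-row success probability; NA where neither condition applies
p <- ifelse(df$x == "A", 0.5,
            ifelse(df$x == "C" & df$y == "N", 0.25, NA_real_))

set.seed(42)
ok <- !is.na(p)                          # rows that should receive a draw
df$z <- NA_integer_                      # start with all NA
df$z[ok] <- rbinom(sum(ok), 1, p[ok])    # one Bernoulli draw per eligible row
```

Here only the eligible rows consume random numbers, and no warning is raised because rbinom never sees an NA probability.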