I have a data frame:
df <- data.frame(id = rep(1:10, each = 10),
                 Room1 = rnorm(100, 0.4, 0.5),
                 Room2 = rnorm(100, 0.3, 0.5),
                 Room3 = rnorm(100, 0.7, 0.5))
I want to change the Room1 column for one group (the rows where id == 10) using case_when:
library(dplyr)
data <- df %>%
  mutate(Room1 = case_when(
    id == 10 ~ 0.6,              # replaces every row with id == 10
    TRUE ~ as.numeric(Room1)     # all other rows keep their value
  ))
However, I only want to change 20% of the rows where id == 10, and those rows should be chosen at random. Can anyone help? Thanks in advance.
CodePudding user response:
Group by id, and use dplyr::percent_rank(runif(n())) <= .2 to select a random 20% of cases within id.
I assume you intend to add more conditions to your case_when(); otherwise, you can use if_else() instead.
set.seed(13)
library(dplyr)
data <- df %>%
  group_by(id) %>%
  mutate(Room1 = case_when(
    id == 10 & percent_rank(runif(n())) <= .2 ~ 0.6,
    TRUE ~ Room1
  )) %>%
  ungroup()
tail(data, 10)
# A tibble: 10 × 4
id Room1 Room2 Room3
<int> <dbl> <dbl> <dbl>
1 10 0.590 0.801 0.745
2 10 0.117 0.517 -0.491
3 10 -0.207 0.533 2.15
4 10 -0.282 -0.249 0.828
5 10 0.6 0.605 0.778
6 10 0.272 0.308 0.0575
7 10 -0.213 0.668 0.476
8 10 0.507 0.923 -0.0948
9 10 0.434 -0.0663 0.0720
10 10 0.6 0.264 0.647
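For reference, the if_else() alternative mentioned in this answer might look roughly like the sketch below. The seed and the name data2 are assumptions made only for illustration; the data frame is rebuilt so the example runs on its own.

```r
library(dplyr)

set.seed(13)  # assumed seed, only so the example is reproducible
df <- data.frame(id = rep(1:10, each = 10),
                 Room1 = rnorm(100, 0.4, 0.5),
                 Room2 = rnorm(100, 0.3, 0.5),
                 Room3 = rnorm(100, 0.7, 0.5))

# data2 is a hypothetical name for the result
data2 <- df %>%
  group_by(id) %>%
  mutate(Room1 = if_else(
    id == 10 & percent_rank(runif(n())) <= 0.2,
    0.6,    # replacement value for the randomly selected rows
    Room1   # every other row keeps its original value
  )) %>%
  ungroup()

# Count how many id == 10 rows were replaced
n_replaced <- sum(data2$id == 10 & data2$Room1 == 0.6)
n_replaced
```

With 10 rows per group, percent_rank() takes the values 0, 1/9, 2/9, and so on, so two rows fall at or below 0.2. For other group sizes the selected share is only approximately 20%.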
CodePudding user response:
A dplyr solution:
library(dplyr)
df %>%
  group_by(id) %>%
  mutate(Room1 = case_when(
    id == 10 & sample(n()) <= n() * 0.2 ~ 0.6,
    TRUE ~ Room1
  )) %>%
  ungroup()
Output:
# A tibble: 100 × 4
id Room1 Room2 Room3
<int> <dbl> <dbl> <dbl>
...
91 10 0.132 -0.595 0.390
92 10 0.258 -0.0995 0.580
93 10 0.239 0.503 0.960
94 10 0.6 0.789 0.744
95 10 0.878 0.308 1.21
96 10 1.24 0.523 1.73
97 10 -0.0795 -0.263 0.546
98 10 0.6 -0.224 0.695
99 10 -0.194 0.524 -0.167
100 10 0.665 0.639 -0.0578
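Since only one group is modified, the same idea can also be sketched in base R without grouping. This is not taken from either answer; the seed and the names idx, k, pick, and df2 are assumptions for illustration.

```r
set.seed(13)  # assumed seed, only so the example is reproducible
df <- data.frame(id = rep(1:10, each = 10),
                 Room1 = rnorm(100, 0.4, 0.5),
                 Room2 = rnorm(100, 0.3, 0.5),
                 Room3 = rnorm(100, 0.7, 0.5))

idx <- which(df$id == 10)        # row positions of the target group
k <- round(0.2 * length(idx))    # number of rows to replace (20% of the group)

# sample.int() draws k positions within idx without replacement;
# indexing idx this way avoids the sample(x) pitfall when x has length 1
pick <- idx[sample.int(length(idx), k)]

df2 <- df                        # work on a copy so df stays unchanged
df2$Room1[pick] <- 0.6

df2[idx, ]                       # inspect the id == 10 rows
```

Here the number of replaced rows is fixed by k rather than by a comparison against a random rank, which may be easier to reason about when group sizes vary.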