I have a two-column data.frame that looks a little like this:
df <- data.frame(Name = rep(paste(letters[1:12],1:12,sep = ""),1),Group = 1:3)
What I would like to do is randomly select, for example, 2 values (without replacement) from 'Name' and store them in a character vector, then select two other values and store them in another vector, and so on. The requirement is that the values sampled together from 'Name' must have the same value in 'Group'.
Is there a fast way of doing this? I could manually create vectors based on a sample of n = 2, then update the contents of the original df and sample again, but I would love to see a more elegant approach. Maybe by storing the sampled values in a list?
Thanks in advance.
CodePudding user response:
A base R option using by and sample:
> with(df, by(Name, Group, sample, 2))
Group: 1
[1] "g7" "d4"
------------------------------------------------------------
Group: 2
[1] "b2" "k11"
------------------------------------------------------------
Group: 3
[1] "i9" "f6"
or a more compact result from aggregate:
> aggregate(. ~ Group, df, sample, 2, simplify = FALSE)
  Group    Name
1     1 j10, a1
2     2 k11, b2
3     3 l12, c3
CodePudding user response:
You can use slice_sample:
library(dplyr)
df %>%
group_by(Group) %>%
slice_sample(n = 2)
  Name  Group
  <chr> <int>
1 a1        1
2 j10       1
3 e5        2
4 b2        2
5 c3        3
6 l12       3
or group_map to get a list:
library(dplyr)
df %>%
group_by(Group) %>%
group_map(~ sample(.x$Name, 2))
[[1]]
[1] "d4" "a1"
[[2]]
[1] "b2" "e5"
[[3]]
[1] "c3" "i9"
or in base R:
split(df$Name, df$Group) |>
lapply(function(x) sample(x, 2))
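The calls above draw one pair per group. If the goal is to keep drawing pairs until each group is used up, with no value drawn twice, one possible sketch is to shuffle each group once and cut the shuffled vector into chunks of two. The helper name draw_chunks is hypothetical, and set.seed is only there to make the draw reproducible. If a group's size is not a multiple of n, the last chunk is simply shorter.

```r
# Same example data as in the question.
df <- data.frame(Name = paste0(letters[1:12], 1:12), Group = 1:3)

set.seed(42)  # only to make the draw reproducible

# Hypothetical helper: shuffle one group's values, then cut them into chunks of size n.
draw_chunks <- function(x, n = 2) {
  shuffled <- sample(x)                          # random permutation of the group
  chunk_id <- ceiling(seq_along(shuffled) / n)   # 1, 1, 2, 2, ...
  unname(split(shuffled, chunk_id))              # list of vectors of length n
}

# One list entry per group; each entry is a list of disjoint pairs.
pairs <- lapply(split(df$Name, df$Group), draw_chunks, n = 2)

pairs[["1"]]  # two character vectors that together cover a1, d4, g7, j10
```

Each pair is then available as, for example, pairs[["1"]][[1]] and pairs[["1"]][[2]], so nothing has to be removed from df between draws.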
CodePudding user response:
Using data.table:
library(data.table)
setDT(df)[df[, sample(.I, 2), Group]$V1]
     Name Group
   <char> <int>
1:    j10     1
2:     g7     1
3:     b2     2
4:    k11     2
5:     i9     3
6:     c3     3
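The result above is the sampled rows. If character vectors are wanted instead, as the question asks, a variant that should work is to wrap each group's sample in list(), which gives one list-column entry per group. This is a sketch rather than part of the original answer; the names dt, res and Names are chosen here for illustration.

```r
library(data.table)

# Same example data as in the question, converted to a separate data.table.
df <- data.frame(Name = paste0(letters[1:12], 1:12), Group = 1:3)
dt <- as.data.table(df)

# Wrapping the sample in list() keeps each group's draw as one character vector.
res <- dt[, .(Names = list(sample(Name, 2))), by = Group]

res$Names[[1]]  # character vector of length 2 drawn from Group 1
```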