Sample vectors from a larger vector in R

I have a two-column data.frame that looks a little like this:

df <- data.frame(Name = rep(paste(letters[1:12],1:12,sep = ""),1),Group = 1:3)

What I would like to do is randomly select, for example, 2 values (without replacement) from 'Name' and store them in a character vector, then select two other values and store them in another vector, and so on. The requirement is that the values sampled together from 'Name' must share the same value in 'Group'.

Is there a fast way of doing this? I could manually create vectors based on a sample of n = 2, then update the contents of the original df and sample again, but I would love to see a more elegant version. Maybe I could store the sampled values in a list?

Thanks in advance.

CodePudding user response：

A base R option using by with sample:

> with(df,  by(Name, Group, sample, 2))
Group: 1
[1] "g7" "d4"
------------------------------------------------------------
Group: 2
[1] "b2"  "k11"
------------------------------------------------------------
Group: 3
[1] "i9" "f6"

or, for a more compact result, aggregate:

> aggregate(. ~ Group, df, sample, 2, simplify = FALSE)
  Group    Name
1     1 j10, a1
2     2 k11, b2
3     3 l12, c3
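
The values printed above will differ from run to run, because sample draws at random. With simplify = FALSE, the Name column of the result is a list holding one character vector per group, so the vectors can be pulled out directly. A minimal sketch, assuming an arbitrary seed of 42 and the hypothetical names `res` and `picked`:

```r
df <- data.frame(Name = paste0(letters[1:12], 1:12), Group = 1:3)

set.seed(42)  # arbitrary seed, used only to make the draw repeatable
res <- aggregate(Name ~ Group, df, sample, 2, simplify = FALSE)

# res$Name is a list column: one vector of two sampled names per group.
# Name the list elements after their group for easier lookup.
picked <- setNames(res$Name, res$Group)

picked[["1"]]  # the two names drawn from group 1
```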

CodePudding user response：

You can use slice_sample:

library(dplyr)
df %>% 
  group_by(Group) %>% 
  slice_sample(n = 2)

  Name  Group
  <chr> <int>
1 a1        1
2 j10       1
3 e5        2
4 b2        2
5 c3        3
6 l12       3

or group_map to get a list:

library(dplyr)
df %>% 
  group_by(Group) %>% 
  group_map(~ sample(.x$Name, 2))

[[1]]
[1] "d4" "a1"

[[2]]
[1] "b2" "e5"

[[3]]
[1] "c3" "i9"

or in base R:

split(df$Name, df$Group) |>
  lapply(function(x) sample(x, 2))
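
The snippets above draw a single pair per group. If, as the question describes, the aim is to keep drawing pairs until each group is used up, one possible approach is to shuffle each group once and cut the shuffled vector into consecutive chunks of two, which should behave like repeated sampling without replacement. A minimal sketch, where `chunk_size`, `draw_chunks`, and `chunks` are hypothetical names; if a group's size is not a multiple of `chunk_size`, the last chunk is simply shorter:

```r
df <- data.frame(Name = paste0(letters[1:12], 1:12), Group = 1:3)

chunk_size <- 2  # hypothetical name: number of values per draw

# Shuffle one group's names, then cut the shuffled vector into
# consecutive chunks of `size` elements.
draw_chunks <- function(x, size) {
  shuffled <- sample(x)  # random permutation of the group's names
  unname(split(shuffled, ceiling(seq_along(shuffled) / size)))
}

# Named list with one element per group; each element is itself a
# list of vectors, one per draw.
chunks <- lapply(split(df$Name, df$Group), draw_chunks, size = chunk_size)

chunks[["1"]]  # two pairs that together cover a1, d4, g7, j10
```

Each inner vector can then be assigned to its own variable if needed, although keeping them in the nested list is usually easier to work with.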

CodePudding user response：

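Using data.table (setDT converts df to a data.table by reference; sample(.I, 2) picks two row indices within each group, and the outer subset returns those rows):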

library(data.table)
setDT(df)[df[, sample(.I, 2), Group]$V1]
     Name Group
   <char> <int>
1:    j10     1
2:     g7     1
3:     b2     2
4:    k11     2
5:     i9     3
6:     c3     3