How to select a sample from a data frame and then remove it from the data frame in R?

Time:06-09

I have a data frame df with 500 rows and 6 columns. With s <- sample_n(df, 100) I get 100 random rows of it. I then want to sample 100 rows from the remaining 400. How can I modify my initial data frame so that the 100 rows I selected are removed? I've seen df[-s] suggested in similar questions, but at least in this case that doesn't work.

CodePudding user response:

Here's a solution with anti_join:

Data:

set.seed(12)
df <- data.frame(
  x = rnorm(100)
)

Procedure:

library(dplyr)
# take the sample and keep it for later use:
s <- sample_n(df, 10)
# subtract the sample from the data frame (matches on all shared columns):
anti_join(df, s)
Joining, by = "x"
              x
1  -1.480567595
2   1.577169472
3  -0.956744479
4  -1.997642097
5  -0.272296044
6  -0.315348711
7  -0.628255237
8  -0.106463885
9   0.428014802
10 -1.293882298
11 -0.779566508
12  0.011951759
13 -0.703464254
14  0.340512271
15  0.506968172
16 -0.293305149
17  0.223641415
18  2.007201457
19  1.011979118
20 -0.302459245
21 -1.025244839
22 -0.267384830
23 -0.199105661
24  0.131122595
25  0.145799896
26  0.362064721
27  0.673981164
28  2.072035768
29 -0.541028649
30 -1.070492158
31 -0.372456732
32 -0.485141355
33  0.274784178
34 -0.479512562
35  0.798105326
36 -1.004451202
37  0.578134627
38 -1.595625656
39 -0.308503656
40  0.449465922
41 -0.977053283
42  0.189997859
43  0.731453357
44 -0.492599111
45 -0.042684912
46 -0.112670576
47  0.456827248
48  2.020334842
49 -1.050890062
50  0.734652106
51  0.539249744
52 -1.314272797
53 -0.250038722
54  0.314204596
55  0.406546694
56  0.994420600
57  0.855768432
58  0.197128917
59  0.834325038
60  0.846790152
61  1.954105255
62 -2.149260002
63  0.971120270
64  1.145061573
65 -0.525400626
66  0.250320103
67 -0.429406611
68 -0.182519622
69 -0.103310466
70 -0.633838203
71 -1.271053787
72 -0.383950394
73  0.516755802
74 -0.177968544
75  0.004258039
76 -1.274059551
77 -0.202110338
78  1.164465880
79 -0.023379409
80 -0.176724455
81  1.113709078
82 -0.541888860
83 -0.963398332
84  0.376448400
85  0.129262533
86 -0.342289274
87  0.452281257
88 -0.694737942
89 -0.239013591
90 -1.007298960
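
One caveat with the anti_join approach: it matches on values, so if df contained duplicate rows, every copy of a sampled row would be dropped, not just the sampled one. An alternative is to sample row positions rather than rows, which also shows why df[-s] fails: s is a data frame, not a vector of row numbers. Below is a minimal base R sketch. The 500 x 6 data frame is a made-up stand-in for the asker's data, and the names idx, s1 and s2 are only illustrative.

```r
set.seed(12)
# Hypothetical stand-in for the asker's data: 500 rows, 6 columns
df <- as.data.frame(matrix(rnorm(500 * 6), nrow = 500))

# Sample row positions, so the same index can both select and drop rows
idx <- sample(nrow(df), 100)
s1  <- df[idx, ]     # first sample of 100 rows
df  <- df[-idx, ]    # what is left: 400 rows

# Second draw from the remaining rows
idx2 <- sample(nrow(df), 100)
s2   <- df[idx2, ]   # second sample of 100 rows
df   <- df[-idx2, ]  # what is left: 300 rows
```

Note the comma in df[-idx, ]: without it, R would treat the index as column positions instead of row positions.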