I want to draw 100 houses at random from my dataset and compute the mean of their total price. I then want to repeat this 100 times, computing the mean price each time, and plot all 100 means in a histogram. This is my code (rome is the house dataset):
run <- rome[sample(1:nrow(rome), 100, replace=FALSE),]
dun <- mean(run$PRICE)
c <- replicate(100, dun)
I also tried a for loop, which I'm pretty sure I need here, but there are mistakes in my code:
d <- for(i in 1:100){
run <- rome[sample(1:nrow(rome), 100, replace=FALSE),]
dun <- mean(run$PRICE)
c <- replicate(100, dun)
}
And finally hist(d), which doesn't run because of the mistakes. Can you help me?
The data (price values):
good_struct <-
c(
47,
113,
165,
104.3,
62.5,
70,
127.5,
64.5,
145,
63.5,
58.9,
65,
48,
3.5,
12.8,
17.5,
36,
41.9,
53.5,
24.5,
24.5,
55.5,
60,
51,
46,
46,
44,
54.9,
42.5,
44,
44.9,
37.9,
33,
43.9,
49.6,
52,
37.5,
50,
35.9,
42.9,
107,
112,
44.9,
55,
102,
35.5,
62.9,
39,
110,
8,
62,
85.9,
57,
110,
67.7,
89.5,
70,
74,
13,
48,
24,
53.5,
34.5,
53,
87.5,
33.5,
24,
9.6,
30,
41,
30,
38.9,
20.7,
49.9,
18.6,
39,
34,
16,
18.9,
15.2,
41.5,
53,
22,
24.9,
6.7,
32.5,
30,
59,
29.5,
26,
16.5,
39,
48.9,
33.5,
46,
54,
57.9,
37.9,
32,
31,
34,
29,
32.5,
51.9,
31,
41.8,
48,
28,
35,
46.5,
51.9,
35.4,
16,
35,
35,
36.5,
35.9,
45,
40,
35,
38,
37,
23,
25.5,
39.5,
21.5,
9,
67.5,
13.4,
12.5,
28.5,
23,
33.5,
9,
11,
30.9,
31.65,
33,
33.4,
47,
40,
46,
45.5,
57,
29.9,
30,
34,
51,
64.5,
57.5,
85.5,
61,
38,
56.5,
60.4,
51.5,
54,
69,
56,
27.9,
37.5,
32.9,
22,
29.9,
39.9,
32.6,
38.5,
21.5,
25.9,
27.5,
22.9,
31.5,
8.5,
5.5,
33,
57,
47,
43.5,
43.9,
68.5,
44.25,
61,
40,
44.5,
57,
35,
35.1,
64.5,
40,
42.6,
50,
58,
58,
55,
43,
54,
39,
45,
42,
38.9,
43.215,
26.5,
30,
29.5
)
CodePudding user response:
Perhaps something like this?
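# Simulated stand-in for the house data: one million normally distributed prices
rome <- data.frame(PRICE = rnorm(1e6, 3e5, 5e4),
                   ID = 1:1e6)
# Preallocate one slot per repetition instead of growing the vector
dun <- numeric(100)
for (i in 1:100) {
  # Draw 100 rows without replacement and store their mean price
  run <- rome[sample(seq_len(nrow(rome)), 100, replace = FALSE), ]
  dun[i] <- mean(run$PRICE)
}
hist(dun)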
CodePudding user response:
Since replicate() is a wrapper around sapply(), consider adjusting the call by passing in an expression that subsets a vector and then calls mean():
random_mean_prices <- replicate(
  100, mean(rome$PRICE[sample(1:nrow(rome), 100, replace = FALSE)])
)
hist(random_mean_prices)
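To see why passing the expression itself matters, here is a minimal sketch. It assumes a small stand-in vector called prices (the first ten values of good_struct from the question), a sample size of 5, and a fixed seed; those names and numbers are only for illustration. replicate() re-evaluates whatever expression it is given, so a number that was computed beforehand is simply copied 100 times, while an expression containing sample() is re-drawn on every repetition.

```r
set.seed(42)  # assumption: fixed seed so the sketch is reproducible

# Hypothetical stand-in for rome$PRICE: first ten values of good_struct
prices <- c(47, 113, 165, 104.3, 62.5, 70, 127.5, 64.5, 145, 63.5)

# Mean computed once up front: replicate() only copies the stored number
one_mean <- mean(sample(prices, 5))
repeated_value <- replicate(100, one_mean)

# Expression passed directly: sample() runs again in every repetition
resampled_means <- replicate(100, mean(sample(prices, 5)))

length(unique(repeated_value))   # a single distinct value
length(unique(resampled_means))  # typically many distinct values
```

This is the same reason the original replicate(100, dun) attempt gave 100 identical numbers: dun had already been evaluated before replicate() saw it.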