How often does the same face come up on consecutive tosses when you flip a single coin 1000 times? What are the relative frequencies of this experiment ending at 2, 3, 4, and 5 tosses?
I started with this code, but I don't know how to continue:
coin <- c("heads", "tails")
num_flips <- 10000
flips <- sample(coin, size = num_flips, replace = TRUE)
freqs <- table(flips)
CodePudding user response:
The secret here is to use run length encoding (rle), which will tell you the length of consecutive flips of the same result.
set.seed(1) # Makes example reproducible
coin <- c("heads", "tails")
num_flips <- 10000
flips <- sample(coin, size = num_flips, replace = TRUE)
RLE <- rle(flips)
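To see what rle returns before reading the 10000-flip result, here is a minimal sketch on a short made-up sequence. The `toy` vector is hypothetical and only for illustration; it is not part of the simulation.

```r
# Toy sequence, invented for illustration only
toy <- c("heads", "heads", "tails", "heads", "heads", "heads")

toy_rle <- rle(toy)

# Length of each run of identical values, in order of appearance
toy_rle$lengths

# The value that each run consists of
toy_rle$values

# inverse.rle() rebuilds the original vector from the encoding
identical(inverse.rle(toy_rle), toy)
```

Each entry in `lengths` pairs with the entry at the same position in `values`, so the encoding loses no information about the original sequence.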
If we examine the RLE object it will show us the number of consecutive heads and tails:
RLE
#> Run Length Encoding
#> lengths: int [1:4912] 1 1 2 1 3 2 5 4 7 1 ...
#> values : chr [1:4912] "heads" "tails" "heads" "tails" "heads" "tails" "heads" ...
If we table the lengths element, we will get the number of runs of each length:
runs <- table(RLE$lengths)
runs
#>    1    2    3    4    5    6    7    8    9   10   11   12   14   16
#> 2431 1224  581  332  180   84   45   17   10    1    1    2    2    2
If we want to know the proportion of runs that are of each length, we can do:
runs / sum(runs)
#>            1            2            3            4            5            6
#> 0.4949104235 0.2491856678 0.1182817590 0.0675895765 0.0366449511 0.0171009772
#>            7            8            9           10           11           12
#> 0.0091612378 0.0034609121 0.0020358306 0.0002035831 0.0002035831 0.0004071661
#>           14           16
#> 0.0004071661 0.0004071661
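As a rough sanity check, these proportions can be compared with what theory predicts. Assuming a fair coin and independent flips (an assumption, not something the simulation proves), a run has length k when the next k - 1 flips match the first one and the flip after that differs, which has probability 0.5^k. The sketch below ignores the truncated final run, and the `observed` counts are copied by hand from the table output above.

```r
# Run lengths to compare
k <- 1:5

# Theoretical share of runs with length k, assuming a fair coin
theory <- 0.5^k

# Observed share: run counts copied from the table above, out of 4912 runs
observed <- c(2431, 1224, 581, 332, 180) / 4912

comparison <- round(rbind(theory, observed), 4)
colnames(comparison) <- k
comparison
```

The simulated proportions should sit close to the theoretical ones, with the gap shrinking as the number of flips grows.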
However, this is not the same as the estimated probability of any single flip belonging to a run of a particular length. To get this, we need to multiply each element of runs by its associated length, which gives the absolute number of flips that belong to runs of each length:
results <- runs * as.numeric(names(runs))
results
#>    1    2    3    4    5    6    7    8    9   10   11   12   14   16
#> 2431 2448 1743 1328  900  504  315  136   90   10   11   24   28   32
Now if we want the proportion of flips that belonged to a run of one particular length, we can do:
results / num_flips
#>
#>      1      2      3      4      5      6      7      8      9     10     11
#> 0.2431 0.2448 0.1743 0.1328 0.0900 0.0504 0.0315 0.0136 0.0090 0.0010 0.0011
#>     12     14     16
#> 0.0024 0.0028 0.0032
This is the estimated probability that any single flip will belong to a run of the given length.
Created on 2022-09-26 with reprex v2.0.2
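The per-flip proportions can be checked against theory in the same spirit. This is a sketch under the assumption of a fair coin with independent flips: runs then have mean length 2, so the expected share of flips that fall in runs of length k is k * 0.5^k / 2. The `observed` values are copied by hand from the results output above and are not recomputed here.

```r
# Run lengths to compare
k <- 1:5

# Theoretical share of flips that sit in a run of length k:
# weight each run length by k, then divide by the mean run length (2)
flip_theory <- k * 0.5^k / 2

# Observed share: flip counts copied from the results above, out of 10000 flips
observed <- c(2431, 2448, 1743, 1328, 900) / 10000

comparison <- round(rbind(flip_theory, observed), 4)
colnames(comparison) <- k
comparison
```

Note that runs of length 1 and length 2 are expected to hold the same share of flips, which matches the near-equal simulated values for those two lengths.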
CodePudding user response:
You can simulate your 10000 coin flips and count the frequency of X consecutive 1s and X consecutive 0s with the rle function.
Example
# Your 10000 flip results
set.seed(1)
flips <- sample(c(0, 1), 10000, replace=TRUE)
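# Run-length encode: TRUE runs are consecutive 1s, FALSE runs are consecutive 0s
nbConsecutive1s <- rle(flips == 1)
# Cross-tabulate run lengths against run values
table(nbConsecutive1s)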
Output
The TRUE column corresponds to consecutive 1s.
The FALSE column corresponds to consecutive 0s.
As you can see, the counts of runs of 1s and runs of 0s are close at every length, which is what you would expect when both outcomes are equally likely.
         values
lengths  FALSE  TRUE
      1   1232  1199
      2    588   636
      3    283   298
      4    183   149
      5     88    92
      6     46    38
      7     21    24
      8      8     9
      9      6     4
     10      0     1
     11      0     1
     12      0     2
     14      1     1
     16      0     2
If you want to converge towards an estimated frequency, you can run the code k times and aggregate the results.
Running it only once gives results that vary around the underlying frequency.
Don't forget to set a seed if you want to make the example reproducible.
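Repeating the simulation and aggregating could look like the sketch below. The number of repetitions `k`, the cutoff `max_len`, and the helper `one_run` are hypothetical choices made for illustration, and averaging the per-run proportions is only one possible way to aggregate.

```r
set.seed(1)       # makes the repeated simulation reproducible

k <- 200          # number of repetitions (arbitrary choice)
num_flips <- 10000
max_len <- 5      # only track run lengths 1 to 5

# One simulation: share of runs having each length from 1 to max_len
one_run <- function() {
  flips <- sample(c(0, 1), num_flips, replace = TRUE)
  lens <- rle(flips)$lengths
  # tabulate() counts lengths 1..max_len and ignores longer runs
  tabulate(lens, nbins = max_len) / length(lens)
}

# Matrix with one column per repetition and one row per run length
props <- replicate(k, one_run())

# Average each run length's proportion across the k repetitions
avg <- rowMeans(props)
names(avg) <- seq_len(max_len)
avg
```

Increasing `k` reduces the variability of the averaged proportions, at the cost of a longer run time.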
rle function and example
https://www.r-bloggers.com/2009/09/r-function-of-the-day-rle-2/