How to count runs of identical elements in a list?

I am working through Automate The Boring Stuff. I have been stuck for hours on a challenge. I am almost there. The challenge is to create a coin toss simulator, and count runs of like results. Specifically, I am looking to capture any run of like elements occurring 6 or more times in a row.

I've written the code below. It is right in 99% of circumstances except I happened to catch an error - it misses a run of 6 if that run is also the last 6 elements in the list (it may do this if it is the first 6, but I haven't been able to simulate it, and if I knew why it would do that, I could probably solve the problem). Any input would be gratefully received.

import random

# Generate a list simulating 100 coin tosses, where 'H' = Heads, 'T' = Tails

for experiment_number in range(10000):
    results = []
    for i in range(100):
        if random.randint(0, 1) == 0:
            results.append('H')
        else:
            results.append('T')


# Code that checks if there is a streak of 6 heads or tails in a row.


number_of_streaks = 0


last_item = None
current_streak = 0
for item in results:
    if item == last_item:
        current_streak += 1
        if current_streak >= 6:
            number_of_streaks += 1
    else:
        current_streak = 0
    last_item = item

print(results)
print(number_of_streaks)

As above, I've struggled, and come up with what's ultimately quite a straightforward solution. It's just wrong in a fraction of cases and I'm not sure why.

CodePudding user response：

Your error is in how you restart the current streak: you should set it to 1, not 0, because the item that breaks a streak is itself the first item of the next one. Otherwise, if you were looking for 3 consecutive 'H' and had 'H', 'H', 'H', 'T', ..., you would miss that first streak.

Moreover, you should check the streak length at the point where you restart the count. Otherwise a run of, for example, 8 consecutive 'H' is counted as 3 streaks when there is only one.
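To make the difference concrete, here is a minimal sketch comparing the two places the length check can go. The function names count_inside_loop and count_when_run_ends are made up for this illustration, and the second one also checks once after the loop, on the assumption that a run touching the end of the list should count too.

```python
def count_inside_loop(results, min_len=6):
    """Add 1 at every item where the running streak is already >= min_len."""
    count, streak, last = 0, 0, None
    for item in results:
        streak = streak + 1 if item == last else 1
        if streak >= min_len:
            count += 1
        last = item
    return count


def count_when_run_ends(results, min_len=6):
    """Count each run once, when it is broken or when the list ends."""
    count, streak, last = 0, 0, None
    for item in results:
        if item == last:
            streak += 1
        else:
            if streak >= min_len:
                count += 1
            streak = 1
        last = item
    # A run that reaches the end of the list is closed here.
    if streak >= min_len:
        count += 1
    return count


tosses = ['H'] * 8 + ['T']
print(count_inside_loop(tosses))    # 3: the same run is counted at items 6, 7 and 8
print(count_when_run_ends(tosses))  # 1: the run is counted once, when 'T' breaks it
```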

I have used a random seed of 25 so that you can reproduce the same result.

import random

random.seed(25)
results = []
for i in range(100):
    if random.randint(0, 1) == 0:
        results.append('H')
    else:
        results.append('T')

number_of_streaks = 0
last_item = None
current_streak = 0
for item in results:
    if item == last_item:
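        current_streak += 1
    else:
        # The previous run has just ended: count it once if it was long enough.
        if current_streak >= 6:
            number_of_streaks += 1
        current_streak = 1
    last_item = item

# A run that reaches the end of the list is never broken inside the loop,
# so check it once more here.
if current_streak >= 6:
    number_of_streaks += 1

print(results)
print(number_of_streaks)
>>> 4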

Note that the result list contains 4 runs of 6 or more consecutive values: 6 'H', 6 'T', 9 'H' and 12 'T'.

['T', 'H', 'H', 'T', 'T', 'H', 'T', 'H', 'T', 'T', 'H', 'H', 'H', 'T', 'H', 'T', 'T', 'H', 'H', 'T', 'T', 'T', 'H', 'H', 'T', 'H', 'H', 'T', 'H', 'H', 'H', 'H', 'H', 'H', 'T', 'T', 'T', 'T', 'T', 'T', 'H', 'T', 'T', 'H', 'H', 'T', 'T', 'T', 'H', 'T', 'H', 'T', 'T', 'H', 'H', 'H', 'H', 'H', 'H', 'H', 'H', 'H', 'T', 'H', 'H', 'T', 'T', 'T', 'T', 'T', 'T', 'T', 'T', 'T', 'T', 'T', 'T', 'H', 'T', 'T', 'T', 'H', 'H', 'H', 'H', 'T', 'H', 'T', 'H', 'H', 'H', 'H', 'T', 'H', 'T', 'H', 'H', 'T', 'H', 'T']

If you use current_streak = 0, you get only 2 sequences although there are 4. And with your original code, with the if statement in the wrong place and current_streak = 0, you get 9, which is also wrong.

Finally, I think your initial for loop is not necessary: it regenerates results 10000 times, but only the last list is checked.
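If the outer loop is meant to repeat the experiment many times, which I am assuming from the exercise, the check has to run inside it. Below is a rough sketch of that idea using itertools.groupby instead of the manual counter, so it is a swapped-in technique and not the code above. The names count_streaks and fraction_with_streak are hypothetical, as are the default parameter values.

```python
import random
from itertools import groupby


def count_streaks(results, min_len=6):
    """Return how many runs of identical items have length >= min_len."""
    # groupby yields one group per run of equal consecutive items,
    # including a run that touches the start or the end of the list.
    return sum(1 for _, run in groupby(results) if len(list(run)) >= min_len)


def fraction_with_streak(experiments=1000, tosses=100, min_len=6, seed=None):
    """Fraction of simulated experiments that contain at least one streak."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(experiments):
        results = [rng.choice('HT') for _ in range(tosses)]
        # The check happens inside the loop, once per experiment.
        if count_streaks(results, min_len) > 0:
            hits += 1
    return hits / experiments


print(count_streaks(list('HHHHHHT')))  # 1 (run at the start)
print(count_streaks(list('THHHHHH')))  # 1 (run at the end)
print(fraction_with_streak(seed=0))
```

Since each run is a single group, a long run is counted once without any extra bookkeeping.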

CodePudding user response：

I would do it this way:

streaks = [[streak, i] for i, streak in enumerate(results) if len(set(results[i:i+6])) == 1 and len(results[i:i+6]) == 6]

With this list comprehension you get a list of ['coin_results', index] pairs, one for each index at which a window of 6 identical results starts in your experiment.
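As a small worked example of what the comprehension returns, here is the same logic wrapped in a function on a short hand-made list. The name window_starts and the size parameter are only for this illustration.

```python
def window_starts(results, size=6):
    """Return [value, start_index] for every full window of `size` equal items."""
    return [
        [value, i]
        for i, value in enumerate(results)
        if len(results[i:i + size]) == size and len(set(results[i:i + size])) == 1
    ]


# Two 'T', then a run of 8 'H' (indexes 2-9), a run of 6 'T' (10-15), one 'H'.
tosses = list('TT' + 'H' * 8 + 'T' * 6 + 'H')
print(window_starts(tosses))
# [['H', 2], ['H', 3], ['H', 4], ['T', 10]]
```

The run of 8 'H' produces three overlapping windows, so it shows up three times, while the run of exactly 6 'T' shows up once.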

To handle the case where there are more than 6 occurrences of the same 'coin_results' e.g.:

streaks = [['H', 36], ['H', 37]]

I would pass streaks to this function

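def real_occurrences(streaks):
    # Keep only the first window of each run: an entry whose index is exactly
    # one more than the previous entry's index belongs to the same run.
    real_streaks = []
    previous_index = None
    for value, index in streaks:
        if previous_index is None or index != previous_index + 1:
            real_streaks.append([value, index])
        previous_index = index
    return len(real_streaks)

which keeps only [['H', 36]] and returns 1.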