How can I find the first and last index of the longest sequence?

Time:06-04

import random

monets = []
for i in range(20):
    choices = ['Tails', 'Eagle']
    monets.append(random.choice(choices))
cnt = 0
prev = 0
for i, e in enumerate(monets):
    if e == 'Eagle':
        cnt += 1
        if e == 'Eagle' and i == len(monets) - 1 and cnt > prev:
                prev = cnt
    elif e != 'Eagle':
        if prev < cnt:
            prev = cnt
        cnt = 0

print(monets)
print(prev)

My code calculates the length of the longest sequence of 'Eagle' in a randomly generated list, but I am stuck on how to get the first and last index of that sequence. I figured that enumerate might help, but I got mixed up. Example: ['Tails', 'Eagle','Eagle','Tails','Eagle'] => output: 1,2

CodePudding user response:

This should work. It is a simple single-pass algorithm, and you don't need any libraries for it:

(revision 2)

m = 0
c = 0
p = -1

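for i, s in enumerate(monets):
    if s == 'Eagle':
        c += 1  # extend the current run
    else:
        c = 0   # run broken, start over
    if c > m:
        m = c   # longest run length so far
        p = i   # index where that run ends

print('max Eagle:', m, 'from:', p + 1 - m, 'to:', p)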

CodePudding user response:

You could also use itertools.groupby to get groups of consecutive "Eagles". Combine that with enumerate, as in your approach, to pair them with the indices, and use max to find the longest sequence. Finally, get the indices from the first and last elements of that list.

>>> from itertools import groupby                                                                           
>>> monets = ['Tails', 'Eagle','Eagle','Tails','Eagle']                                                     
>>> max((list(g) for k, g in groupby(enumerate(monets), key=lambda x: x[1]) if k == "Eagle"), key=len)      
[(1, 'Eagle'), (2, 'Eagle')]
>>> _[0][0], _[-1][0]                                                                                       
(1, 2)
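
One caveat with the one-liner above: max raises ValueError when the list contains no "Eagle" at all. A minimal sketch of the same groupby idea wrapped in a function that guards against this is shown here; the name longest_run_bounds and the target parameter are made up for illustration and are not part of the original answer.

```python
from itertools import groupby

def longest_run_bounds(items, target="Eagle"):
    """Return (first, last) index of the longest run of target, or None."""
    runs = (
        list(group)
        for key, group in groupby(enumerate(items), key=lambda pair: pair[1])
        if key == target
    )
    # default=None avoids a ValueError when there is no run at all;
    # on ties max keeps the first longest run it sees
    longest = max(runs, key=len, default=None)
    if longest is None:
        return None
    return longest[0][0], longest[-1][0]

print(longest_run_bounds(['Tails', 'Eagle', 'Eagle', 'Tails', 'Eagle']))  # (1, 2)
print(longest_run_bounds(['Tails', 'Tails']))  # None
```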

CodePudding user response:

Just from reading your code, it looks like you already have the following computation working (i.e. it is generally correct, though I didn't actually run it to test for bugs):

['Tails', 'Eagle','Eagle','Tails','Eagle']  # monets list
[      0,       1,      2,      0,      1]  # 'Eagle' sequence lengths

There are a few different ways to do what you want, but continuing on your existing methodology, you can indeed use enumerate to generate the following:

[ (0, 0),  (1, 1), (2, 2), (3, 0), (4, 1)]  # seq lengths from before, enumerated

Where each pair represents: (index, length)

From that, find the pair with the largest length, and you'll have the end index of the sequence, in this case: (2, 2).

The first instance of length == 1, searching backwards from the end index, will give you the start index.
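
The steps above could be sketched roughly as follows. This is only one way to write it, and the function name eagle_run_bounds is invented for this example; it returns None when there is no 'Eagle' in the list, which is an assumption about the desired behaviour.

```python
def eagle_run_bounds(monets):
    """Return (start, end) index of the longest 'Eagle' run, or None."""
    # Running length of the current 'Eagle' run at each position,
    # e.g. [0, 1, 2, 0, 1] for the example list
    lengths = []
    run = 0
    for coin in monets:
        run = run + 1 if coin == 'Eagle' else 0
        lengths.append(run)

    # No elements, or no 'Eagle' at all
    if not lengths or max(lengths) == 0:
        return None

    # The (index, length) pair with the largest length marks the end of
    # the run; on ties max keeps the first such pair
    end_idx, best_len = max(enumerate(lengths), key=lambda pair: pair[1])

    # Search backwards from the end index for the first length == 1
    start_idx = end_idx
    while lengths[start_idx] != 1:
        start_idx -= 1
    return start_idx, end_idx

print(eagle_run_bounds(['Tails', 'Eagle', 'Eagle', 'Tails', 'Eagle']))  # (1, 2)
```

Since the lengths count up by one inside a run, the backwards search could also be replaced by the arithmetic end_idx - best_len + 1.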

Sidenote: @tobias_k's answer is written in a more functional style (which I also personally prefer). It's a different methodology than you've started with, but I highly recommend learning it. Here is that method written more (IMO) readably:

import itertools as it

monets = ['Tails', 'Eagle','Eagle','Tails','Eagle']
grouped = it.groupby(enumerate(monets), key=lambda pair: pair[1])
eagle_seqs = [list(seq) for v, seq in grouped if v == 'Eagle']
longest_seq = max(eagle_seqs, key=len)
seq_idxs = [i for i, _ in longest_seq]

start_idx, end_idx = seq_idxs[0], seq_idxs[-1]

CodePudding user response:

Here is a solution based on NumPy and pandas:

import random
import numpy as np
import pandas as pd

monets = []
for i in range(20):
    choices = ['Tails', 'Eagle']
    monets.append(random.choice(choices))

The only additional steps are to encode the sequence as numeric values and then find the longest run of consecutive indices:

encode_ = {'Tails': 0, 'Eagle': 1}
df = pd.DataFrame(monets).replace(encode_)
A = np.where(df == 1)[0]
result = max(np.split(A, np.where(np.diff(A) != 1)[0] + 1), key=len).tolist()
start_idx, end_idx = result[0],result[-1]

CodePudding user response:

Using a down-to-earth approach (it returns the position of the first maximal sequence of consecutive terms):

lst = ['Tails', 'Eagle', 'Eagle','Tails', 'Eagle', 'Eagle','Eagle', 'Eagle', 'Tails', 'Eagle', 'Eagle','Tails']

index, counter = -1, 0
tmp_i, tmp_c = -1, 0
for i, v in enumerate(lst):
    if v == 'Eagle':
        # tmp-update
        tmp_c += 1
        if tmp_i == -1:
            tmp_i = i
    else:
        if tmp_c > counter:
            # global update
            counter = tmp_c
            index = tmp_i
        # reset
        tmp_i, tmp_c = -1, 0

# final check for occurrence of max sequence at the end of the list 
if tmp_c > counter:
    # global update
    counter = tmp_c
    index = tmp_i

boundaries_max_seq = (index, index + counter - 1)

print(boundaries_max_seq)
# (4, 7)