Capturing consecutive occurrences and finding their counts in a list

I have a list of recent occurrences, for example:

['user', 'user', 'sys', 'sys', 'user', 'user', 'sys', 'user']

For each element, I want to know how many times it occurred consecutively. Whenever the element changes, say from 'user' to 'sys', the count should start fresh. The output I am looking for is:

[('user', 2), ('sys', 2), ('user', 2), ('sys', 1), ('user', 1)]

This would help me identify the pattern that the user and the system are following. Any help is much appreciated.

CodePudding user response：

Use itertools.groupby:

from itertools import groupby

lst = ['user', 'user', 'sys', 'sys', 'user', 'user', 'sys', 'user']

out = [(value, sum(1 for _ in group)) for value, group in groupby(lst)]
print(out)

Prints:

[('user', 2), ('sys', 2), ('user', 2), ('sys', 1), ('user', 1)]
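
Note that groupby only groups adjacent equal elements, which is why 'user' shows up three times in the result instead of being merged into one entry. As a rough sketch, the same idea can be wrapped in a small helper; the name run_lengths is made up for illustration and is not part of itertools:

```python
from itertools import groupby


def run_lengths(items):
    """Return (value, run_length) pairs for consecutive equal items.

    Hypothetical helper name; it only wraps itertools.groupby.
    """
    return [(value, len(list(group))) for value, group in groupby(items)]


lst = ['user', 'user', 'sys', 'sys', 'user', 'user', 'sys', 'user']

runs = run_lengths(lst)
print(runs)  # [('user', 2), ('sys', 2), ('user', 2), ('sys', 1), ('user', 1)]

# Longest streak of 'user'; default=0 covers the case where 'user' never occurs.
longest_user = max((n for value, n in runs if value == 'user'), default=0)
print(longest_user)  # 2
```

An empty input simply gives an empty list, so no special case is needed.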

CodePudding user response：

You could install the more-itertools package and use its run_length.encode function:

In [1]: from more_itertools import run_length

In [2]: list(run_length.encode(['user', 'user', 'sys', 'sys', 'user', 'user', 'sys', 'user']))
Out[2]: [('user', 2), ('sys', 2), ('user', 2), ('sys', 1), ('user', 1)]

CodePudding user response：

You can iterate over the list and keep a running count for the current value, without any additional library. The logic is below:

l = ['user', 'user', 'sys', 'sys', 'user', 'user', 'sys', 'user']

f = []
x = [l[0], 0]

for i in l:
    if i == x[0]:
        x[1] += 1
    else:
        f.append(tuple(x))
        # Resetting the value
        x = [i, 1]

# Adding the last iteration value
f.append(tuple(x))

print(f)

Here, x is a temporary list that holds the current value and its running count. The count starts at zero because the loop visits the first value again. Note that l[0] raises an IndexError if the list is empty.

Output: f -> [('user', 2), ('sys', 2), ('user', 2), ('sys', 1), ('user', 1)]
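
Since the loop above reads l[0] up front, it cannot handle an empty list. One possible variant, sketched here as a function (the name count_runs is hypothetical), avoids indexing and so also works on empty input and on any iterable:

```python
def count_runs(items):
    """Count consecutive repeats without any library (hypothetical helper)."""
    runs = []
    current = None
    count = 0  # 0 means no run has been started yet
    for item in items:
        if count and item == current:
            # Same value as the current run: extend it
            count += 1
        else:
            # Value changed: store the finished run, if any, and start a new one
            if count:
                runs.append((current, count))
            current, count = item, 1
    # Store the final run, if the input was not empty
    if count:
        runs.append((current, count))
    return runs


l = ['user', 'user', 'sys', 'sys', 'user', 'user', 'sys', 'user']
print(count_runs(l))   # [('user', 2), ('sys', 2), ('user', 2), ('sys', 1), ('user', 1)]
print(count_runs([]))  # []
```

Using count as the "run started" flag, rather than comparing against None, keeps this correct even if None itself appears in the list.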