I have a list of about 75,000 items containing a random sequence of 1s and -1s. I want to count how many times 1 appears in a run of length one, of length two, of length three, and so on, and the same for -1.
For example: My_list = [1,1,1,-1,-1,1,1,1,-1,-1,1,-1,1...]
The expected output would be (1: 1x2, 2x0, 3x2 | -1: 1x1, 2x2). In words: 1 appears once on its own 2 times, twice in a row 0 times, and three times in a row 2 times; -1 appears once on its own 1 time and twice in a row 2 times.
Thank you
I am very new to Python and am learning it mainly for my trading project. So far I can only count the total number of occurrences of a given value, not the number of consecutive runs of each length.
CodePudding user response:
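Approach
Run-length encode the list with itertools.groupby, count the (value, run length) pairs with collections.Counter, then group those counts by value.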
Code
from itertools import groupby
from collections import Counter

def run_stats(lst):
    def rle(lst):
        ' Run length encoding of the runs (value, runlength) '
        return [(key, len(list(group))) for key, group in groupby(lst)]

    # Count of (value, runlength) pairs
    cnts = Counter(rle(lst))

    # Aggregate runs of 1/-1 tuples in lists
    stats = {1: [],    # 1 run pairs
             -1: []}   # -1 run pairs
    for tup, cnt in cnts.items():
        val, runlength = tup
        stats[val].append((runlength, cnt))

    return stats
Usage
# Test Data
lst = [1,1,1,-1,-1,1,1,1,-1,-1,1,-1,1]
# Generate result
results = run_stats(lst)
# Format Output (using JamieDoombos formatting)
for val in results:
    print(f'{val}:', ','.join(f'{run}x{count}' for run, count in results[val]))
Output
1: 3x2,1x2
-1: 2x2,1x1
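The output above lists only the run lengths that actually occur, in order of first appearance. To get the layout asked for in the question (every length from 1 up to the longest run, with zero counts included), a minimal sketch could look like the following. The helper names run_length_counts and format_runs are hypothetical and not part of the answer above; the sketch relies on Counter returning 0 for a missing key.

```python
from collections import Counter
from itertools import groupby

def run_length_counts(seq):
    """Map each value to a Counter of {run length: number of runs}."""
    counts = {}
    for value, group in groupby(seq):
        length = sum(1 for _ in group)          # length of this run
        counts.setdefault(value, Counter())[length] += 1
    return counts

def format_runs(counts):
    """Render every length from 1 to the longest run, zeros included."""
    parts = []
    for value, by_length in counts.items():
        longest = max(by_length)
        # Counter returns 0 for lengths that never occurred
        body = ",".join(f"{n}x{by_length[n]}" for n in range(1, longest + 1))
        parts.append(f"{value}:{body}")
    return "|".join(parts)

lst = [1,1,1,-1,-1,1,1,1,-1,-1,1,-1,1]
print(format_runs(run_length_counts(lst)))
# 1:1x2,2x0,3x2|-1:1x1,2x2
```

Values appear in the order they are first seen in the list, so a list that starts with -1 would print the -1 section first.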
CodePudding user response:
I suggest iterating over the list using an index and, for each new item, counting how many times it repeats consecutively and storing the result in a dict.
The resulting runs dict in this code has the list domain (-1, 1) as keys. Each value in the dict is another dict with the run lengths as keys and the number of occurrences of each run length as values.
from collections import defaultdict

My_list = [1,1,1,-1,-1,1,1,1,-1,-1,1,-1,1]

# map of values to a map of lengths to the number of occurrences
runs = defaultdict(lambda: defaultdict(int))

list_length = len(My_list)
index = 0
while index < list_length:
    item = My_list[index]
    run_length = 1
    index += 1
    # extend the run while the next item repeats the current one
    while index < list_length and My_list[index] == item:
        index += 1
        run_length += 1
    runs[item][run_length] += 1

for value, value_runs in runs.items():
    print(f'{value}:', ','.join(f'{run}x{count}' for run, count in value_runs.items()))
Result (run lengths that never occur are not listed):
1: 3x2,1x2
-1: 2x2,1x1
EDIT: this now uses a defaultdict, so it handles runs of any length as well as values outside the (-1, 1) domain.
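If the data arrives as an iterator or stream rather than a list (an assumption; the question only mentions a list), the same counting idea works without an index by remembering the current value and its run length. This is a sketch only, and the function name count_runs is hypothetical.

```python
from collections import defaultdict

def count_runs(iterable):
    """Return {value: {run length: number of runs}} in a single pass."""
    runs = defaultdict(lambda: defaultdict(int))
    iterator = iter(iterable)
    try:
        current = next(iterator)        # first item starts the first run
    except StopIteration:
        return runs                     # empty input: no runs at all
    run_length = 1
    for item in iterator:
        if item == current:
            run_length += 1             # the run continues
        else:
            runs[current][run_length] += 1   # the run ended, record it
            current, run_length = item, 1
    runs[current][run_length] += 1      # record the final run
    return runs

result = count_runs([1,1,1,-1,-1,1,1,1,-1,-1,1,-1,1])
for value, value_runs in result.items():
    print(f'{value}:', ','.join(f'{run}x{count}' for run, count in value_runs.items()))
# 1: 3x2,1x2
# -1: 2x2,1x1
```

Because nothing is indexed, this also accepts generators, for example values read line by line from a file.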
CodePudding user response:
You could try this: use groupby with a defaultdict(list) to walk the list and collect the length of each run of consecutive 1s or -1s.
The defaultdict dd ends up holding, for 1 and for -1, the list of run lengths in the order they occur. You can then print or format it however you like.
from itertools import groupby
from collections import defaultdict

L = [1,1,1,-1,-1,1,1,1,-1,-1,1,-1,1,1,1,1,1,-1,-1,-1]   # sample input

dd = defaultdict(list)

for k, g in groupby(L, lambda x: x>0):
    if k:       # True :: 1
        dd[1].append(len(list(g)))
    else:       # False :: -1
        dd[-1].append(len(list(g)))

print(dd)
# defaultdict(<class 'list'>, {1: [3, 3, 1, 5], -1: [2, 2, 1, 3]})
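The lists of run lengths in dd can be turned into the "length x count" summary from the question by tallying each list with collections.Counter. The following is one possible sketch; the function name tally_runs is hypothetical, and unlike the snippet above it groups on the value itself rather than on x > 0, which gives the same result for a list containing only 1 and -1.

```python
from collections import Counter, defaultdict
from itertools import groupby

def tally_runs(seq):
    """Collect run lengths per value, then count how often each length occurs."""
    dd = defaultdict(list)
    for value, group in groupby(seq):
        dd[value].append(len(list(group)))
    return {value: Counter(lengths) for value, lengths in dd.items()}

L = [1,1,1,-1,-1,1,1,1,-1,-1,1,-1,1,1,1,1,1,-1,-1,-1]   # sample input
for value, tally in tally_runs(L).items():
    print(f"{value}:", ",".join(f"{n}x{tally[n]}" for n in sorted(tally)))
# 1: 1x1,3x2,5x1
# -1: 1x1,2x2,3x1
```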