Selectively sum elements of list in Python

Say I have a list of floats like this one:

My goal is to sum the first and last elements of each group of values that lie within a given range, in my case at most 1 apart, for example:

00000.0001 + 00001.0009
12345.0015 + 12346.0010
54321.0021 + 54321.0023

I'm trying to iterate over the list using zip and range based on its length, storing the first element of each group first, but cannot quite nail the solution.

I don't know much about pandas, but would that help? Any other tips would be appreciated; I don't necessarily need the exact solution.

CodePudding user response：

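Just iterate over the list and keep track of where the current group started. Whenever a value is at least the threshold away from the group's first element, add that first element to the previous value (the group's last element) and start a new group at the current value.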

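things = [1, 2, 3, 4, 10, 11, 12, 21, 22, 23]
start = things[0]      # first element of the current group
threshold = 10
totals = []

for i, n in enumerate(things):
    # n is too far from the group's first element: close the group
    if n - start >= threshold:
        totals.append(start + things[i-1])
        start = n

# handle the final group, which the loop never closes
if start != n:
    totals.append(start + n)
else:
    totals.append(n)

print(totals)
assert sum(totals) == (1 + 10) + (11 + 12) + (21 + 23)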
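
The loop above can also be wrapped in a reusable function. This is only a sketch: `group_ends` is a hypothetical name, the input is assumed to be sorted, and a group is assumed to end as soon as a value is `threshold` or more above the group's first element. It returns (first, last) pairs rather than sums, so a single-element group shows up as the same value twice and the caller decides how to total it.

```python
def group_ends(values, threshold):
    """Return (first, last) pairs for each group of sorted values.

    Assumption: a group ends when a value is `threshold` or more
    above the group's first element.
    """
    if not values:
        return []
    pairs = []
    start = prev = values[0]
    for n in values[1:]:
        if n - start >= threshold:
            # close the current group at the previous value
            pairs.append((start, prev))
            start = n
        prev = n
    # the last group is never closed inside the loop
    pairs.append((start, prev))
    return pairs


things = [1, 2, 3, 4, 10, 11, 12, 21, 22, 23]
pairs = group_ends(things, 10)
print(pairs)                        # [(1, 10), (11, 12), (21, 23)]
print([a + b for a, b in pairs])    # [11, 23, 44]
```

Keeping the pairs separate from the sums makes it easy to check the grouping before adding anything up.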

CodePudding user response：

You can use zip to identify the elements that are more than 1.0 (the threshold) away from their predecessor. Applying a cumulative sum to these breaks produces group identifiers that can then be used with groupby():

L = [0.0001,
0.0002,
0.0003,
0.0004,
1.0009,
12345.0015,
12345.0016,
12345.0017,
12345.0018,
12345.0019,
12346.0010,
54321.0021,
54321.0022,
54321.0023]

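from itertools import groupby, accumulate

threshold = 1.0
# Running count of breaks: it goes up by one each time a value is more than
# `threshold` above its predecessor, which gives one identifier per group.
groups = accumulate(b - a > threshold for a, b in zip(L[:1] + L, L))
# Unpacking into `g, g[:]` binds g to the one-item key list, then overwrites
# its contents with the group's members, so g ends up holding the group.
result = [(g[0], g[-1]) for g, g[:] in groupby(L, lambda _: [next(groups)])]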

print(result)           
[(0.0001, 0.0004), 
 (1.0009, 1.0009),           # 1.0009 - 0.0004 = 1.0005 > 1.0
 (12345.0015, 12346.001), 
 (54321.0021, 54321.0023)]
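
For readers who find the `g, g[:]` unpacking hard to follow, the same predecessor-based grouping can be sketched in a more explicit form, followed by the sums the question asks for. The names `split_on_gaps`, `runs` and `sums` are made up for illustration, the input is assumed to be a sorted list, and a single-element group is assumed to count as both first and last.

```python
from itertools import accumulate, groupby


def split_on_gaps(values, threshold=1.0):
    """Split a sorted list into runs, starting a new run whenever a value
    is more than `threshold` above its predecessor."""
    # The first value is compared with itself, so it never counts as a break.
    breaks = (b - a > threshold for a, b in zip(values[:1] + values, values))
    ids = accumulate(breaks)  # cumulative count of breaks = group identifier
    return [[v for _, v in grp]
            for _, grp in groupby(zip(ids, values), key=lambda p: p[0])]


L = [0.0001, 0.0002, 0.0003, 0.0004, 1.0009,
     12345.0015, 12345.0016, 12345.0017, 12345.0018, 12345.0019,
     12346.0010, 54321.0021, 54321.0022, 54321.0023]

runs = split_on_gaps(L)
print([(r[0], r[-1]) for r in runs])
# [(0.0001, 0.0004), (1.0009, 1.0009), (12345.0015, 12346.001), (54321.0021, 54321.0023)]

# Sum of first and last element of each run (subject to float rounding).
sums = [r[0] + r[-1] for r in runs]
print(sums)
```

Pairing each value with its group identifier before calling groupby() avoids relying on the key function's side effects, at the cost of one extra list comprehension.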