I have a list like this:
ls = [0, 1, 2, 4, 6, 7] # it won't have duplicates and it is sorted
Now I want to group this list into bins based on an offset (in this example offset=1), which should return this:
[[0, 1, 2], [4], [6, 7]]
# Note:
# Within [0, 1, 2], 0 and 2 differ by more than 1,
# but each pair of neighbours (0 and 1, 1 and 2) differs by at most 1, and this is what I want
Is there a high-level function in numpy, scipy, pandas, etc. which will provide my desired result?
Note: The returned data structure doesn't have to be a list; any type is welcome.
CodePudding user response:
Using pure Python:
ls = [0, 1, 2, 4, 6, 7]
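def group(l, offset=1):
    if not l:  # an empty input has no groups (avoids IndexError on l[0])
        return []
    out = []
    tmp = []
    prev = l[0]
    for val in l:
        # start a new group when the gap to the previous value exceeds offset
        if val - prev > offset:
            out.append(tmp)
            tmp = []
        tmp.append(val)
        prev = val
    out.append(tmp)  # flush the last group
    return out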
group(ls)
# [[0, 1, 2], [4], [6, 7]]
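To make the "compare neighbours, not the first element of the group" rule explicit, here is a minimal sketch that walks over consecutive pairs. The name group_by_gap is hypothetical and not part of any library; it assumes the input is sorted, as stated in the question.

```python
def group_by_gap(values, offset=1):
    """Hypothetical helper: split sorted values wherever neighbours differ by more than offset."""
    if not values:
        return []
    groups = [[values[0]]]
    # zip pairs each element with its successor: (0, 1), (1, 2), (2, 4), ...
    for prev, cur in zip(values, values[1:]):
        if cur - prev > offset:
            groups.append([cur])    # gap too large: open a new group
        else:
            groups[-1].append(cur)  # close enough: extend the current group
    return groups


ls = [0, 1, 2, 4, 6, 7]
print(group_by_gap(ls, offset=1))  # [[0, 1, 2], [4], [6, 7]]
print(group_by_gap(ls, offset=2))  # [[0, 1, 2, 4, 6, 7]]
print(group_by_gap(ls, offset=0))  # [[0], [1], [2], [4], [6], [7]]
print(group_by_gap([]))            # []
```

With offset=2 no neighbouring gap exceeds the offset, so everything ends up in one group, even though the first and last values are far apart.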
With pandas:
import pandas as pd
offset = 1
s = pd.Series(ls)
s.groupby(s.diff().gt(offset).cumsum()).agg(list)
output:
0    [0, 1, 2]
1          [4]
2       [6, 7]
dtype: object
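The pandas one-liner can be easier to follow when split into its intermediate steps. The variable names steps, is_break and group_id below are just illustrative; the calls are the same ones used above.

```python
import pandas as pd

ls = [0, 1, 2, 4, 6, 7]
offset = 1
s = pd.Series(ls)

steps = s.diff()              # NaN, 1, 1, 2, 2, 1  (difference to the previous value)
is_break = steps.gt(offset)   # False, False, False, True, True, False  (NaN compares as False)
group_id = is_break.cumsum()  # 0, 0, 0, 1, 2, 2    (increments at every break)

result = s.groupby(group_id).agg(list)
print(result.tolist())  # [[0, 1, 2], [4], [6, 7]]
```

Each True in is_break marks the first element of a new group, so the running sum gives every element the id of the group it belongs to.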
With numpy:
import numpy as np
offset = 1
a = np.split(ls, np.nonzero(np.diff(ls) > offset)[0] + 1)
# [array([0, 1, 2]), array([4]), array([6, 7])]
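As a rough walk-through of why the split positions need to be shifted by one, the numpy version can be unpacked like this. The names gaps, breaks and split_at are only for illustration, and the final tolist() conversion is an optional extra for getting plain Python lists.

```python
import numpy as np

ls = [0, 1, 2, 4, 6, 7]
offset = 1

gaps = np.diff(ls)                     # [1 1 2 2 1], gaps[i] = ls[i+1] - ls[i]
breaks = np.nonzero(gaps > offset)[0]  # [2 3], indices into gaps where the gap is too large
split_at = breaks + 1                  # [3 4], gaps[i] belongs to ls[i+1], so shift by one

groups = [g.tolist() for g in np.split(ls, split_at)]
print(groups)  # [[0, 1, 2], [4], [6, 7]]
```

np.diff returns one element fewer than its input, and gaps[i] describes the step into ls[i+1]; that is why the indices from np.nonzero are shifted by one before being passed to np.split.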