How to group approximately adjacent list-CodePudding

I have a list that has approximately adjacent.

x=[10,11,13,70,71,73,170,171,172,174]

I need to separate this into lists which has minimum deviation (i.e)

y=[[10,11,13],[70,71,73],[170,171,172,174]]

You can see in y list grouped into 3 separate lists and break this list when meeting huge deviation. Can you give me a tip or any source to solve this?

CodePudding user response：

the zip function is your friend when you need to compare items of a list with their successor or predecessor:

x=[10,11,13,70,71,73,170,171,172,174]

threshold = 50
breaks    = [i for i,(a,b) in enumerate(zip(x,x[1:]),1) if b-a>threshold]
groups    = [x[s:e] for s,e in zip([0] breaks,breaks [None])]

print(groups)
[[10, 11, 13], [70, 71, 73], [170, 171, 172, 174]]

breaks will contain the index (i) of elements (b) that are greater than their predecessor (a) by more than the treshold value.
Using zip() again allows you to pair up these break indexes to form start/end ranges which you can apply to the original list to get your groupings.

Note that i used a fixed threshold to detect a "huge" deviation, but you can use a percentage or any formula/condition of your choice in place of if b-a>threshold. If the deviation calculation is complex, you will probably want to make a deviates() function and use it in the list comprehension: if deviates(a,b) so that it remains intelligible

If zip() and list comprehensions are too advanced, you can do the same thing using a simple for-loop:

def deviates(a,b):  # example of a (huge) deviation detection function
    return b-a > 50  

groups   = []   # resulting list of groups
previous = None # track previous number for comparison
for number in x:
    if not groups or deviates(previous, number): 
        groups.append([number])   # 1st item or deviation, add new group 
    else:
        groups[-1].append(number) # approximately adjacent, add to last group
    previous = number             # remember previous value for next loop

CodePudding user response：

Something like this should do the trick:

test_list = [10, 11, 13, 70, 71, 73, 170, 171, 172, 174]


def group_approximately_adjacent(numbers):
    if not numbers:
        return []

    current_number = numbers.pop(0)
    cluster = [current_number]
    clusters = [cluster]

    while numbers:
        next_number = numbers.pop(0)
        if is_approximately_adjacent(current_number, next_number):
            cluster.append(next_number)
        else:
            cluster = [next_number]
            clusters.append(cluster)
        current_number = next_number

    return clusters


def is_approximately_adjacent(a, b):
    deviation = 0.25
    return abs(a * (1   deviation)) > abs(b) > abs(a * (1 - deviation))