How to group approximately adjacent list

Time:10-05

I have a list whose values are approximately adjacent:

x=[10,11,13,70,71,73,170,171,172,174]

I need to split it into sublists whose members deviate only minimally from one another, i.e.:

y=[[10,11,13],[70,71,73],[170,171,172,174]]

As you can see, y groups the list into 3 separate sublists, breaking wherever there is a huge deviation between neighbouring values. Can you give me a tip or point me to a source for solving this?

CodePudding user response:

The zip function is your friend when you need to compare the items of a list with their successor or predecessor:

x=[10,11,13,70,71,73,170,171,172,174]

threshold = 50
breaks    = [i for i,(a,b) in enumerate(zip(x,x[1:]),1) if b-a>threshold]
groups    = [x[s:e] for s,e in zip([0]+breaks,breaks+[None])]

print(groups)
[[10, 11, 13], [70, 71, 73], [170, 171, 172, 174]]
  • breaks will contain the index (i) of each element (b) that is greater than its predecessor (a) by more than the threshold value.
  • Using zip() again allows you to pair up these break indexes to form start/end ranges which you can apply to the original list to get your groupings.
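
To make the two steps above concrete, here is a small sketch that prints the intermediate values for the same list. The name ranges is introduced only for illustration and is not part of the original answer.

```python
x = [10, 11, 13, 70, 71, 73, 170, 171, 172, 174]
threshold = 50

# Index of every element that exceeds its predecessor by more than threshold
breaks = [i for i, (a, b) in enumerate(zip(x, x[1:]), 1) if b - a > threshold]
print(breaks)   # [3, 6]

# Pair each start index with the next break; None means "up to the end of the list"
ranges = list(zip([0] + breaks, breaks + [None]))
print(ranges)   # [(0, 3), (3, 6), (6, None)]

# Slice the original list with each (start, end) pair
groups = [x[s:e] for s, e in ranges]
print(groups)   # [[10, 11, 13], [70, 71, 73], [170, 171, 172, 174]]
```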

Note that I used a fixed threshold to detect a "huge" deviation, but you can use a percentage or any formula/condition of your choice in place of if b-a>threshold. If the deviation calculation is complex, you will probably want to write a deviates() function and call it in the list comprehension (if deviates(a,b)) so that it remains intelligible.
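
As one possible illustration of the percentage idea, the sketch below plugs a deviates() function into the same list comprehension. The 25% ratio and the exact rule are assumptions chosen for this example, not something the question specifies.

```python
def deviates(a, b, ratio=0.25):
    # Assumed rule: b is a "huge" jump if it exceeds a by more than ratio * |a|
    return b - a > ratio * abs(a)

x = [10, 11, 13, 70, 71, 73, 170, 171, 172, 174]

breaks = [i for i, (a, b) in enumerate(zip(x, x[1:]), 1) if deviates(a, b)]
groups = [x[s:e] for s, e in zip([0] + breaks, breaks + [None])]

print(groups)   # [[10, 11, 13], [70, 71, 73], [170, 171, 172, 174]]
```

A relative rule like this scales with the magnitude of the values, so a gap of 5 would split small numbers but not large ones.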

If zip() and list comprehensions are too advanced, you can do the same thing using a simple for-loop:

def deviates(a,b):  # example of a (huge) deviation detection function
    return b-a > 50  

groups   = []   # resulting list of groups
previous = None # track previous number for comparison
for number in x:
    if not groups or deviates(previous, number): 
        groups.append([number])   # 1st item or deviation, add new group 
    else:
        groups[-1].append(number) # approximately adjacent, add to last group
    previous = number             # remember previous value for next loop
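
If you want to reuse the loop above with different conditions, one option is to wrap it in a function that takes the deviation test as a parameter. This is only a sketch: the name group_by_deviation and the lambda conditions are made up for this example.

```python
def group_by_deviation(values, deviates):
    """Split values into groups, starting a new group whenever
    deviates(previous, number) is true."""
    groups = []
    previous = None
    for number in values:
        if not groups or deviates(previous, number):
            groups.append([number])    # 1st item or deviation: start a new group
        else:
            groups[-1].append(number)  # approximately adjacent: extend last group
        previous = number
    return groups

x = [10, 11, 13, 70, 71, 73, 170, 171, 172, 174]

print(group_by_deviation(x, lambda a, b: b - a > 50))
# [[10, 11, 13], [70, 71, 73], [170, 171, 172, 174]]

print(group_by_deviation(x, lambda a, b: b - a > 1))
# [[10, 11], [13], [70, 71], [73], [170, 171, 172], [174]]
```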

CodePudding user response:

Something like this should do the trick:

test_list = [10, 11, 13, 70, 71, 73, 170, 171, 172, 174]


def group_approximately_adjacent(numbers):
    if not numbers:
        return []

    # Iterate instead of calling numbers.pop(0), so the caller's list is left intact
    remaining = iter(numbers)
    current_number = next(remaining)
    cluster = [current_number]
    clusters = [cluster]

    for next_number in remaining:
        if is_approximately_adjacent(current_number, next_number):
            cluster.append(next_number)
        else:
            cluster = [next_number]   # deviation found, start a new cluster
            clusters.append(cluster)
        current_number = next_number

    return clusters


def is_approximately_adjacent(a, b):
    deviation = 0.25
    return abs(a * (1 + deviation)) > abs(b) > abs(a * (1 - deviation))
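
One thing to watch with the band check above: when a is 0, both bounds collapse to 0, so nothing counts as adjacent to 0 (not even another 0). If that matters for your data, a possible alternative is the standard library's math.isclose, sketched below. It is not an exact equivalent, and the tolerance values here are assumptions for illustration.

```python
import math

def is_approximately_adjacent(a, b, rel_tol=0.25, abs_tol=0.0):
    # math.isclose measures the gap relative to the larger of |a| and |b|,
    # so it is not numerically identical to the a * (1 ± deviation) band.
    return math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)

print(is_approximately_adjacent(10, 11))           # True
print(is_approximately_adjacent(13, 70))           # False
print(is_approximately_adjacent(0, 0))             # True
print(is_approximately_adjacent(0, 1, abs_tol=2))  # True
```

The abs_tol parameter gives a minimum absolute gap that is always tolerated, which is what makes values near zero groupable.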