How to merge list values continuously, conditional on a comparison, in Python?

I want to update the input list as it progresses through the loop.

"""
indata
a 1
b 2
c 3
g 7
j 10
k 11
o 15
p 16
q 17
r 18
v 22

The goal is to keep merging consecutive characters while the
distance between the last merged character and the next one
is at most a given distance, retaining the highest index
of the merged characters; here, let's say the distance is 2.

expected output: with min distance of 2

a-b-c 3
g 7
j-k 11
o-p-q-r 18
v 22
"""

from pprint import pprint

in_data = [['a', 1], ['b', 2], ['c', 3], 
           ['g', 7], ['j', 10], ['k', 11], 
           ['o', 15], ['p', 16], ['q', 17], 
           ['r', 18], ['v', 22]]

"""expected output: with min distance of 2
[['a-b-c', 3],
 ['g', 7],
 ['j-k', 11],
 ['o-p-q-r', 18],
 ['v', 22]]
"""

print("input data:")
pprint(in_data)

for i, (x1, y1) in enumerate(in_data):
    # print(i, x1, y1)
    for j, (x2, y2) in enumerate(in_data):
        if j <= i: continue
        if y2 - y1 <= 2:
            new_x = x1 + '-' + x2
            new_y = max(y1, y2)
            in_data[i] = [new_x, new_y]
            del in_data[j]
            break
    # break

print("updated input:")
pprint(in_data)

This is as far as I have gotten; I have not been able to make further progress.

$ python merge_test.py 
input data: 
[['a', 1],
 ['b', 2],
 ['c', 3],
 ['g', 7],
 ['j', 10],
 ['k', 11],
 ['o', 15],
 ['p', 16],
 ['q', 17],
 ['r', 18],
 ['v', 22]]


updated input:
[['a-b', 2],
 ['c', 3],
 ['g', 7],
 ['j-k', 11],
 ['o-p', 16],
 ['q-r', 18],
 ['v', 22]]

Is there any way to improve this loop, or another way to do it?

CodePudding user response：

Try the following approach:

in_data = [['a', 1], ['b', 2], ['c', 3], ['g', 7], ['j', 10], ['k', 11], ['o', 15], ['p', 16], ['q', 17], ['r', 18], ['v', 22]]
distance = 2
i = 0

while i < len(in_data) - 1:
    if in_data[i + 1][1] - in_data[i][1] <= distance:
        in_data[i][0] = in_data[i][0] + in_data[i + 1][0]
        in_data[i][1] = in_data[i + 1][1]
        del in_data[i + 1]
    else:
        i += 1

print(in_data)

This outputs:

[['abc', 3], ['g', 7], ['jk', 11], ['opqr', 18], ['v', 22]]

For problems like this, it is often simplest to make the changes in place (without creating a new list), deleting elements as they become obsolete and using a while loop so that the index can be controlled manually.
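
As a rough sketch of that in-place idea, the loop can be wrapped in a function that also keeps the '-' separator from the question's expected output. The name merge_in_place and the sep parameter are illustrative assumptions, not part of the answer above, and the input is assumed to be sorted by its numeric value.

```python
def merge_in_place(data, distance=2, sep="-"):
    """Merge neighbouring [label, value] pairs of `data` in place.

    Hypothetical helper: assumes `data` is sorted by value.
    """
    i = 0
    while i < len(data) - 1:
        label, value = data[i]
        next_label, next_value = data[i + 1]
        if next_value - value <= distance:
            # Absorb the neighbour into the current entry, then drop it.
            data[i] = [label + sep + next_label, next_value]
            del data[i + 1]
            # Do not advance: the grown entry may absorb the next one too.
        else:
            i += 1
    return data


in_data = [['a', 1], ['b', 2], ['c', 3], ['g', 7], ['j', 10], ['k', 11],
           ['o', 15], ['p', 16], ['q', 17], ['r', 18], ['v', 22]]
merge_in_place(in_data)
print(in_data)
# [['a-b-c', 3], ['g', 7], ['j-k', 11], ['o-p-q-r', 18], ['v', 22]]
```

The index only moves forward when no merge happened, which is what lets a run such as o, p, q, r collapse into a single entry.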

CodePudding user response：

I'd do this by iterating over in_data and accumulating the results in a new list:

>>> in_data = [['a', 1], ['b', 2], ['c', 3],
...            ['g', 7], ['j', 10], ['k', 11],
...            ['o', 15], ['p', 16], ['q', 17],
...            ['r', 18], ['v', 22]]
>>> out_data = [in_data[0]]
>>> for [char, num] in in_data[1:]:
...     if ord(char) - ord(out_data[-1][0][-1]) < 2:
...         out_data[-1][0] += "-" + char
...         out_data[-1][1] = num
...     else:
...         out_data.append([char, num])
...
>>> out_data
[['a-b-c', 3], ['g', 7], ['j-k', 11], ['o-p-q-r', 18], ['v', 22]]

Then, if you want the accumulated version to become the new in_data, you can finish with in_data = out_data (or in_data[:] = out_data if you want to mutate the original list object rather than just rebind the name).
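
To illustrate the difference between the two assignments, here is a minimal sketch. It assumes a hypothetical helper, merge_close, which compares the numeric values (as in the question) instead of the character codes, and which builds fresh inner lists so the input pairs are not modified while accumulating.

```python
def merge_close(data, distance=2, sep="-"):
    """Return a new list with neighbouring [label, value] pairs merged.

    Hypothetical helper: assumes `data` is sorted by value.
    """
    out = []
    for label, value in data:
        if out and value - out[-1][1] <= distance:
            # Close enough to the last accumulated entry: extend it.
            out[-1][0] += sep + label
            out[-1][1] = value
        else:
            out.append([label, value])  # fresh list, so `data` is not mutated
    return out


in_data = [['a', 1], ['b', 2], ['c', 3], ['g', 7], ['j', 10], ['k', 11],
           ['o', 15], ['p', 16], ['q', 17], ['r', 18], ['v', 22]]
alias = in_data              # a second name for the same list object

out_data = merge_close(in_data)
in_data[:] = out_data        # slice assignment mutates the existing list
print(alias is in_data)      # True
print(alias)
# [['a-b-c', 3], ['g', 7], ['j-k', 11], ['o-p-q-r', 18], ['v', 22]]
```

With a plain in_data = out_data instead, alias would still refer to the original, unmerged list, because only the name in_data would be rebound.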