Mapping items in one list to the items in another list

I have two lists in Python and I'm trying to map the values of one to the other.

List 1 (coordinates):

['7,16', '71,84', '72,48', '36,52', '75,36', '52,28', '76,44', '11,69', '56,35',
 '15,21', '32,74', '88,32', '10,74', '61,34', '51,85', '10,75', '55,96',
 '94,12', '34,64', '71,59', '76,75', '25,16', '54,100', '62,1', '60,85',
 '16,32', '14,77', '40,78', '2,60', '71,4', '78,91', '100,98', '42,32', '37,49',
 '49,34', '3,5', '42,77', '39,60', '38,77', '49,40', '40,53', '57,48', '14,99',
 '66,67', '10,9', '97,3', '66,76', '86,68', '10,60', '8,87']

List 2 (index):

[3, 2, 3, 3, 3, 3, 3, 1, 3, 3, 2, 3, 1, 3, 2, 1, 2, 3, 2, 3, 2, 3, 2, 3, 2, 3,
 1, 2, 1, 3, 2, 2, 3, 3, 3, 3, 2, 2, 2, 3, 3, 3, 1, 2, 3, 3, 2, 2, 1, 1]

For the output, I need to have something like:

cluster_1: [x, y], [a,b]...

cluster_2: [c, d], [e, f]...

cluster_3: [g, h], [o, j]...

I tried doing this in a dictionary, but I can only get it to put in the last coordinate in the for loop for each value. It also always outputs keys starting from 0, and I'm looking to label them starting from 1.

for i in range(len(patients)):
    # other stuff
    k = 3
    for b in range(k):
        if cluster == (k - b):
            dct['cluster_%s' % b] = patients[i]

which outputs:

{'cluster_0': '97,3', 'cluster_1': '86,68', 'cluster_2': '8,87'}

I've tried using dct['cluster_%s' % b].append(patients[i]) but I get a key error on cluster_0. Any help would be much appreciated!

CodePudding user response：

You can zip your indices and coordinates, then loop over them element-wise and populate a dictionary based on the index.

clusters = {}
for idx, coord in zip(index, coords):
    if idx in clusters:
        clusters[idx].append(coord.split(','))
    else:
        clusters[idx] = [coord.split(',')]

The result, where clusters[i] refers to the i-th cluster:

>>> clusters
{
    3: [['7', '16'], ['72', '48'], ['36', '52'], ['75', '36'], ['52', '28'], ['76', '44'], ['56', '35'], ['15', '21'], ['88', '32'], ['61', '34'], ['94', '12'], ['71', '59'], ['25', '16'], ['62', '1'], ['16', '32'], ['71', '4'], ['42', '32'], ['37', '49'], ['49', '34'], ['3', '5'], ['49', '40'], ['40', '53'], ['57', '48'], ['10', '9'], ['97', '3']],
    2: [['71', '84'], ['32', '74'], ['51', '85'], ['55', '96'], ['34', '64'], ['76', '75'], ['54', '100'], ['60', '85'], ['40', '78'], ['78', '91'], ['100', '98'], ['42', '77'], ['39', '60'], ['38', '77'], ['66', '67'], ['66', '76'], ['86', '68']],
    1: [['11', '69'], ['10', '74'], ['10', '75'], ['14', '77'], ['2', '60'], ['14', '99'], ['10', '60'], ['8', '87']]
}
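
To get the labels the question asks for (cluster_1, cluster_2, ...) with numeric coordinates, the same loop can be written with collections.defaultdict. This is only a minimal sketch, not part of the original answer: the short coords and index lists are a sample standing in for the full lists, and the 'cluster_%s' key format is an assumption based on the desired output in the question.

```python
from collections import defaultdict

# Shortened sample of the question's data, for illustration only.
coords = ['7,16', '71,84', '72,48', '11,69', '10,74']
index = [3, 2, 3, 1, 1]

# defaultdict(list) creates an empty list the first time a key is used,
# which avoids the KeyError raised by calling .append on a missing key.
clusters = defaultdict(list)
for idx, coord in zip(index, coords):
    x, y = coord.split(',')
    # Use the cluster index itself in the label, so labels start at 1.
    clusters['cluster_%s' % idx].append([int(x), int(y)])

for label in sorted(clusters):
    print(label, clusters[label])
```

Because the label is built from the index value rather than from a loop counter, no cluster_0 key is ever created.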

CodePudding user response：

Here is just another way, using itertools. However, I would also use Cory Kramer's answer, as it is simple and easy to read (and therefore preferable)!

from itertools import groupby

# groupby only groups consecutive items, so sort by cluster index first.
data = sorted(zip(cluster_indices, values), key=lambda pair: pair[0])
grouped = groupby(data, key=lambda pair: pair[0])
clusters = {
    cluster: [value.split(",") for _, value in group]
    for cluster, group in grouped
}

print(clusters)
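
The sorted() call above is needed because itertools.groupby only groups consecutive items that share a key. A small sketch with made-up sample data (not the full lists from the question) shows what happens without it:

```python
from itertools import groupby

# Made-up sample data, for illustration only.
cluster_indices = [3, 2, 3]
values = ['7,16', '71,84', '72,48']

pairs = list(zip(cluster_indices, values))

# Without sorting, cluster 3 shows up as two separate runs.
unsorted_keys = [key for key, _ in groupby(pairs, key=lambda p: p[0])]

# After sorting by cluster index, each cluster shows up exactly once.
sorted_pairs = sorted(pairs, key=lambda p: p[0])
sorted_keys = [key for key, _ in groupby(sorted_pairs, key=lambda p: p[0])]

print(unsorted_keys)  # [3, 2, 3]
print(sorted_keys)    # [2, 3]
```

In a dict comprehension, the second run for cluster 3 would overwrite the first, silently dropping values.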

Cory Kramer's solution extracted to a function so it can be reused:

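# Named group_by so it does not shadow itertools.groupby or the built-in id.
def group_by(indices, values, map_fn):
    """Group values by index, storing map_fn(index, value) for each one."""
    grouped = {}
    for index, value in zip(indices, values):
        if index in grouped:
            grouped[index].append(map_fn(index, value))
        else:
            grouped[index] = [map_fn(index, value)]
    return grouped

clusters = group_by(cluster_indices, values, lambda _, value: value.split(","))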

print(clusters)