Return large list of tuples with replaced dictionary value

Time:03-22

In python, I have a list of tuples (lot) with patient data, as shown below:

lot = [('490001', 'A-ARM1', '1', '2', "a", "b"),
       ('490001', 'A-ARM2', '3', '4', "c", "d"),
       ('490002', 'B-ARM3', '5', '6', "e", "f")]

In my real dataset, lot consists of 50-150 tuples (depending on the patient). I loop over the second element of every tuple and want to replace each 'A-' or 'B-' prefix with the corresponding dictionary value, so that the output becomes:

[('490001', 'ZZARM1', '1', '2', 'a', 'b'), ('490001', 'ZZARM2', '3', '4', 'c', 'd'), ('490002', 'XXARM3', '5', '6', 'e', 'f')]

To achieve this, I've written the code below. I was wondering if there is a cleaner (shorter) way of writing it, in particular the line that builds 'lot2'. The code should also work well for a large list of tuples, as stated above. I'm eager to learn from you!

from more_itertools import grouper
dict = {'A-': 'ZZ', 'B-': 'XX'}

for el1, el2, *rest in lot:
    for i, j in grouper(el2, 2):
        if i + j in dict:
            lot2 = [ ( tpl[0], (tpl[1].replace(tpl[1][:2], dict[tpl[1][:2]])), tpl[2], tpl[3], tpl[4], tpl[5] ) for tpl in lot]
print(lot2)

CodePudding user response:

If you're looking for shorter code, here is a version that doesn't use more_itertools.grouper. It iterates over lot and modifies the second element of each tuple as it goes (if it needs to be changed). Note that I renamed dict to dct here; dict is the builtin dict constructor, and naming your variables the same as Python builtins creates problems later on.

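dct = {'A-': 'ZZ', 'B-': 'XX'}

lot2 = []
for el1, el2, *rest in lot:
    prefix = el2[:2]
    # only rewrite the second element when its prefix has a replacement
    if prefix in dct:
        el2 = dct[prefix] + el2[2:]
    lot2.append((el1, el2, *rest))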
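
One reason the loop above slices instead of calling str.replace: replace substitutes every occurrence of the prefix text, not just the leading one. The small sketch below compares the two on a made-up value ('A-ARMA-' is hypothetical and does not appear in the question's data), so treat it as an illustration rather than something the real dataset necessarily needs.

```python
# Hypothetical value in which the prefix text also appears later in the string.
dct = {'A-': 'ZZ', 'B-': 'XX'}
el2 = 'A-ARMA-'
prefix = el2[:2]

# str.replace substitutes every occurrence of the prefix text.
replaced = el2.replace(prefix, dct[prefix])

# Slicing rewrites only the first two characters.
sliced = dct[prefix] + el2[2:]

print(replaced)  # ZZARMZZ
print(sliced)    # ZZARMA-
```

If the prefix text can never occur again later in the second element, both approaches give the same result.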

If the prefix of the second element is always in dct, the if-condition in the loop isn't needed, and the following works as well:

lot2 = []
for el1, el2, *rest in lot:
    prefix = el2[:2]
    el2 = dct[prefix] + el2[2:]
    lot2.append((el1, el2, *rest))

which can be written even more concisely:

lot2 = [(el1, dct[el2[:2]] + el2[2:], *rest) for el1, el2, *rest in lot]

Output:

[('490001', 'ZZARM1', '1', '2', 'a', 'b'),
 ('490001', 'ZZARM2', '3', '4', 'c', 'd'),
 ('490002', 'XXARM3', '5', '6', 'e', 'f')]
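
If some prefixes might be missing from dct, the one-liner would raise a KeyError. A possible middle ground, sketched below, is dict.get with the original prefix as the fallback, which keeps the comprehension but leaves unknown prefixes untouched. The fourth tuple (with prefix 'C-') is a hypothetical row added only to show the fallback; it is not part of the question's data.

```python
dct = {'A-': 'ZZ', 'B-': 'XX'}
lot = [('490001', 'A-ARM1', '1', '2', 'a', 'b'),
       ('490001', 'A-ARM2', '3', '4', 'c', 'd'),
       ('490002', 'B-ARM3', '5', '6', 'e', 'f'),
       ('490003', 'C-ARM4', '7', '8', 'g', 'h')]  # hypothetical row, prefix not in dct

# dct.get(prefix, prefix) returns the replacement if there is one,
# otherwise the prefix itself, so the element is rebuilt unchanged.
lot2 = [(el1, dct.get(el2[:2], el2[:2]) + el2[2:], *rest)
        for el1, el2, *rest in lot]

print(lot2)
```

The first three tuples come out as in the output above, and the 'C-ARM4' row passes through unchanged.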