How do I add an underscore to a duplicate value in a list?

I have the following list:

poly = ['rs1', 'rs1', 'rs2', 'rs2', 'rs3', 'rs3', 'rs4', 'rs4', 'rs5', 'rs5']

All I need is to add an underscore to duplicate values, in order to get something like this:

['rs1', 'rs1_', 'rs2', 'rs2_', 'rs3', 'rs3_', 'rs4', 'rs4_', 'rs5', 'rs5_']

I've checked some similar public questions but only managed to add a number after each value (therefore getting something like:

['rs11', 'rs12', 'rs21', 'rs22', 'rs31', 'rs32', 'rs41', 'rs42', 'rs51', 'rs52']

which is definitely far from my aim).

Any ideas?

Thank you.

CodePudding user response：

By looping through your poly list and keeping track of elements you've seen before, you can identify and add an _ to each duplicate in the original list:

poly = ['rs1', 'rs1', 'rs2', 'rs2', 'rs3', 'rs3', 'rs4', 'rs4', 'rs5', 'rs5']

already_encountered = []

for x in range(len(poly)):
    if poly[x] in already_encountered:
        poly[x] = f'{poly[x]}_'
    else:
        already_encountered.append(poly[x])

print(poly)

Output:

['rs1', 'rs1_', 'rs2', 'rs2_', 'rs3', 'rs3_', 'rs4', 'rs4_', 'rs5', 'rs5_']

CodePudding user response：

One-liner for your decoding pleasure :)

[poly[i] + '_' if poly[i] in poly[:i] else poly[i] for i in range(len(poly))]
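
For anyone who finds the one-liner hard to read, here is a rough unrolled sketch of the same logic. The function name mark_repeats is made up for illustration. Note that the slice items[:i] is rebuilt and searched on every step, so the work grows roughly quadratically with the length of the list; that is fine for short lists like the one in the question.

```python
def mark_repeats(items):
    """Return a new list in which every repeat of an earlier value gets a trailing underscore.

    Hypothetical helper name, used only for this illustration.
    """
    result = []
    for i, value in enumerate(items):
        # items[:i] is everything that comes before position i in the original list
        if value in items[:i]:
            result.append(value + '_')
        else:
            result.append(value)
    return result


poly = ['rs1', 'rs1', 'rs2', 'rs2', 'rs3', 'rs3', 'rs4', 'rs4', 'rs5', 'rs5']
print(mark_repeats(poly))
```

Unlike the loop in the first answer, this builds a new list and leaves poly unchanged.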

CodePudding user response：

A more idiomatic version of PangolinPaws's answer would be:

poly = ['rs1', 'rs1', 'rs2', 'rs2', 'rs3', 'rs3', 'rs4', 'rs4', 'rs5', 'rs5']

seen = set()

for idx, v in enumerate(poly):
    if v in seen:
        poly[idx] = f'{v}_'

    seen.add(v) # sets are unique by nature - no need to put it in an else

print(poly)

This gives the same overall outcome but performs better:

Using set() makes each "in" check O(1) on average, instead of O(n) for a list.
Using enumerate(...) yields the index and the value together, so there is no need to index into the list multiple times.
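
If you would rather not modify poly in place, the same set-based idea can be wrapped in a small function that returns a new list. This is only a sketch: the name add_underscore_to_duplicates is hypothetical, and it assumes the values are hashable (strings are), since they are stored in a set.

```python
def add_underscore_to_duplicates(items):
    """Return a copy of items with '_' appended to every value already seen earlier.

    Hypothetical helper; assumes the values are hashable.
    """
    seen = set()
    result = []
    for value in items:
        # Membership tests on a set take constant time on average
        result.append(f'{value}_' if value in seen else value)
        seen.add(value)
    return result


poly = ['rs1', 'rs1', 'rs2', 'rs2', 'rs3', 'rs3', 'rs4', 'rs4', 'rs5', 'rs5']
print(add_underscore_to_duplicates(poly))
```

Keep in mind that every repeat gets exactly one underscore, so a value appearing three times comes out as 'rs1', 'rs1_', 'rs1_'. If each repeat has to be unique, a per-value counter would be needed instead of a set.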