How do I add an underscore to a duplicate value in a list?


I have the following list:

poly = ['rs1', 'rs1', 'rs2', 'rs2', 'rs3', 'rs3', 'rs4', 'rs4', 'rs5', 'rs5']

All I need is to add an underscore to duplicate values, in order to get something like this:

['rs1', 'rs1_', 'rs2', 'rs2_', 'rs3', 'rs3_', 'rs4', 'rs4_', 'rs5', 'rs5_']

I've checked some similar public questions but only managed to add a number after each value (therefore getting something like:

['rs11', 'rs12', 'rs21', 'rs22', 'rs31', 'rs32', 'rs41', 'rs42', 'rs51', 'rs52']

which is definitely far from my aim).

Any ideas?

Thank you.

CodePudding user response:

By looping through your poly list and keeping track of elements you've seen before, you can identify and add an _ to each duplicate in the original list:

poly = ['rs1', 'rs1', 'rs2', 'rs2', 'rs3', 'rs3', 'rs4', 'rs4', 'rs5', 'rs5']

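already_encountered = []

for x in range(len(poly)):
    if poly[x] in already_encountered:
        # seen before: mark this occurrence as a duplicate
        poly[x] = f'{poly[x]}_'
    else:
        # first occurrence: remember it for later checks
        already_encountered.append(poly[x])

print(poly)

['rs1', 'rs1_', 'rs2', 'rs2_', 'rs3', 'rs3_', 'rs4', 'rs4_', 'rs5', 'rs5_']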

CodePudding user response:

One-liner for your decoding pleasure :)

[poly[i] + '_' if poly[i] in poly[:i] else poly[i] for i in range(len(poly))]
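
For anyone decoding it: spelled out as a plain loop, the one-liner compares each element against the slice of everything that comes before it. A rough equivalent is sketched below (the `result` name is only for illustration). Note that each `in poly[:i]` check scans a copy of the preceding elements, so this approach is quadratic in the length of the list.

```python
poly = ['rs1', 'rs1', 'rs2', 'rs2', 'rs3', 'rs3', 'rs4', 'rs4', 'rs5', 'rs5']

result = []  # illustrative name; the one-liner builds this list implicitly
for i in range(len(poly)):
    # poly[:i] holds every element before position i in the original list
    if poly[i] in poly[:i]:
        result.append(poly[i] + '_')
    else:
        result.append(poly[i])

print(result)
```

Unlike the loop-based answers, this builds a new list and leaves `poly` itself unchanged.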

CodePudding user response:

A more idiomatic version of PangolinPaws's answer would be:

poly = ['rs1', 'rs1', 'rs2', 'rs2', 'rs3', 'rs3', 'rs4', 'rs4', 'rs5', 'rs5']

seen = set()

for idx, v in enumerate(poly):
    if v in seen:
        poly[idx] = f'{v}_'

    seen.add(v) # sets are unique by nature - no need to put it in an else


This gives the same result but performs better:

  • using set() makes the "in" checks O(1) on average, instead of O(n) for a list
  • using enumerate(...) removes the need to index into the list multiple times
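
If you would rather not modify the list in place, the same set-based idea can be wrapped in a small function that returns a new list. This is only a sketch: the name `mark_duplicates` and the `suffix` parameter are made up for illustration, not something from the question.

```python
def mark_duplicates(values, suffix='_'):
    """Return a new list where each repeat of an earlier value gets `suffix` appended."""
    seen = set()
    result = []
    for v in values:
        # membership test on a set is O(1) on average
        result.append(f'{v}{suffix}' if v in seen else v)
        seen.add(v)
    return result


poly = ['rs1', 'rs1', 'rs2', 'rs2', 'rs3', 'rs3', 'rs4', 'rs4', 'rs5', 'rs5']
print(mark_duplicates(poly))
# ['rs1', 'rs1_', 'rs2', 'rs2_', 'rs3', 'rs3_', 'rs4', 'rs4_', 'rs5', 'rs5_']
```

One thing to be aware of: with three or more occurrences, every repeat gets the same single suffix, so ['a', 'a', 'a'] becomes ['a', 'a_', 'a_']. If the results need to be unique, you would have to keep a count per value instead of a set.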