How to replace duplicates with string and keep only first element in the list of lists?

Time:10-08

my_list = [
     ['common1', '112', '4000'],
     ['common1', '11', '11'],
     ['common1', '33', '33'],
     ['common1', '33', '900'], 
     ['common2', '31', '400'],
     ['common2', '2', '2666']
]

 

I want to convert this list into

 [
    ['common1', '112', '4000'],
    ['', '11', '11'], 
    ['', '33', '33'],
    ['', '33', '900'], 
    ['common2', '31', '400'], 
    ['', '2', '2666']
 ] 

I want to replace each repeated first element (for example common1) with an empty string '' and keep only its first occurrence. The same applies to common2.

CodePudding user response:

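You can try this. It blanks the first element only when it repeats the row directly above it, so a value that reappears later (like common1 in the example below) is kept: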

def replace_duplicate(my_list):
    prev_val = ''

    for lst in my_list:
        if prev_val == lst[0]:
            lst[0] = ''
        else:
            prev_val = lst[0]

my_list = [
    ['common1', '112', '4000'],
    ['common1', '11', '11'],
    ['common1', '33', '33'],
    ['common1', '33', '900'], 
    ['common2', '31', '400'],
    ['common2', '2', '2666'],
    ['common3', '115', '5000'],
    ['common3', '12', '15'],
    ['common1', '222', '6000'],
    ['common1', '55', '66'],
    ['common1', '77', '99']
]

replace_duplicate(my_list)
print(my_list)

Output:

[
    ['common1', '112', '4000'],
    ['', '11', '11'],
    ['', '33', '33'],
    ['', '33', '900'],
    ['common2', '31', '400'],
    ['', '2', '2666'],
    ['common3', '115', '5000'],
    ['', '12', '15'],
    ['common1', '222', '6000'],
    ['', '55', '66'],
    ['', '77', '99']
]
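
If mutating my_list in place is not wanted, the same "consecutive runs" idea can be sketched with itertools.groupby, which groups adjacent rows sharing a first element. This is only an illustrative variant, not part of the answer above; the function name blank_repeated_heads and the sample rows are made up for the example, and it assumes every row has at least one item.

```python
from itertools import groupby


def blank_repeated_heads(rows):
    """Return new rows where only the first row of each consecutive run keeps its head."""
    result = []
    # groupby yields one group per run of adjacent rows with the same first element
    for _, run in groupby(rows, key=lambda row: row[0]):
        for position, row in enumerate(run):
            head = row[0] if position == 0 else ''
            result.append([head, *row[1:]])  # build a copy; the input row is untouched
    return result


rows = [
    ['common1', '112', '4000'],
    ['common1', '11', '11'],
    ['common2', '31', '400'],
    ['common1', '222', '6000'],
]
print(blank_repeated_heads(rows))
```

Because groupby only looks at adjacent rows, the last common1 row starts a new run and keeps its value, matching the behaviour of replace_duplicate.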

CodePudding user response:

Here's a simple solution that tracks every first element seen so far:

def remove_sec_dups(data):
    seen = set()  # keep track of first seen values

    for sub in data:  # get sublists
        first = sub[0]  # get first item
        if first in seen:
            sub[0] = ''  # empty
        else:
            seen.add(first)  # add this to our seen set

remove_sec_dups(my_list)
print(my_list)

With the my_list from the question, this outputs [['common1', '112', '4000'], ['', '11', '11'], ['', '33', '33'], ['', '33', '900'], ['common2', '31', '400'], ['', '2', '2666']]
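
A possible variant of the seen-set approach that returns a new list instead of modifying the input is sketched below. The name blank_seen_heads and the sample data are hypothetical, and the sketch assumes each row is non-empty and its first element is hashable (strings are).

```python
def blank_seen_heads(rows):
    """Return new rows; a head keeps its text only the first time it appears."""
    seen = set()
    result = []
    for head, *rest in rows:  # assumes every row has at least one item
        result.append(['' if head in seen else head, *rest])
        seen.add(head)
    return result


original = [
    ['common1', '112', '4000'],
    ['common1', '11', '11'],
    ['common2', '31', '400'],
    ['common1', '222', '6000'],
]
cleaned = blank_seen_heads(original)
print(cleaned)
print(original[1][0])  # the input rows are left unchanged
```

Unlike a check against only the previous row, the set remembers every value, so the last common1 row is blanked even though it is not adjacent to the earlier ones.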

CodePudding user response:

The following also handles the general unsorted case, where repeats of a value are not adjacent; every occurrence after the first is blanked:

seen = set()
for i, (head, *_) in enumerate(my_list):
    if head in seen:
        my_list[i][0] = ""
    seen.add(head)