Remove duplicate lines with a specific string from a file

Time:11-14

I have a file from which I need to remove every line whose last three characters match those of an earlier line.

file.txt contains

['aabbccj', 'biukghk', 'hgkfhff', 'hsgfccj', ' jflgsfs', 'fskfyhd', 'bfsbkhd', 'fjlfghk']

I want the output to be

['aabbccj', 'biukghk', 'hgkfhff', ' jflgsfs', 'fskfyhd', 'bfsbkhd']

CodePudding user response:

Keep a separate list of the endings you have already seen. As you loop through the entries, skip any entry whose last three characters are already in that list; otherwise keep the entry and record its ending:

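# named "entries" rather than "list" to avoid shadowing the built-in list type
entries = ['aabbccj',  'biukghk',  'hgkfhff',  'hsgfccj', ' jflgsfs', 'fskfyhd',  'bfsbkhd',  'fjlfghk']
endings = []  # three-character endings seen so far
results = []  # entries kept, in their original order
for entry in entries:
    if entry[-3:] in endings:
        # ending already seen, skip this entry
        continue

    results.append(entry)
    endings.append(entry[-3:])

print(results)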

That should do the trick.

CodePudding user response:

If you want to remove elements whose last 3 characters have already appeared in an earlier element, you can try this:

my_list = ['aabbccj',  'biukghk',  'hgkfhff',  'hsgfccj', ' jflgsfs', 'fskfyhd',  'bfsbkhd',  'fjlfghk']
suffix = set()
output = []
for ele in my_list:
    #if common suffix detected skip the element
    if ele[-3:] in suffix:
        continue
    #else add suffix to set and append element to output
    else:
        suffix.add(ele[-3:])
        output.append(ele)
print(output)
#Output: ['aabbccj', 'biukghk', 'hgkfhff', ' jflgsfs', 'fskfyhd', 'bfsbkhd']
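
Since the question mentions a file, here is a rough sketch of the same set-based idea wrapped in a function and applied to file contents. It assumes that file.txt holds a single Python list literal exactly as shown in the question; the helper names `drop_repeated_suffixes` and `dedupe_file` are made up for illustration. If your file actually has one string per line, you would read the lines and strip the newlines instead of using `ast.literal_eval`.

```python
import ast


def drop_repeated_suffixes(items, n=3):
    """Keep only the first item seen for each distinct n-character ending."""
    seen = set()   # endings encountered so far
    kept = []      # items kept, in their original order
    for item in items:
        suffix = item[-n:]
        if suffix not in seen:
            seen.add(suffix)
            kept.append(item)
    return kept


def dedupe_file(path, n=3):
    # Assumption: the file contains one Python list literal of strings.
    with open(path, encoding="utf-8") as fh:
        items = ast.literal_eval(fh.read())
    return drop_repeated_suffixes(items, n)
```

Calling `dedupe_file("file.txt")` on the data from the question should give the same list as the loop above. A set is used for the seen endings because membership checks on a set do not slow down as the number of distinct endings grows.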