Editing list elements using comprehension deletes part of list

I have a number of nested lists from a web scraped table that I want to 'clean' by removing unhelpful HTML characters. They look like this:

example_list = ['12.7x55 mm PS12B',
  '<td style="border-bottom:solid 2px">102\n</td>',
  '<td style="border-bottom:solid 2px">46\n</td>',
  '<td style="border-bottom:solid 2px">57\n</td>',
  '<td style="border-bottom:solid 2px; background-color:#00990080;">6\n</td>',
  '<td style="border-bottom:solid 2px; background-color:#00640080;">5\n</td>',
  '<td style="border-bottom:solid 2px; background-color:#FB9C0E80;">4\n</td>']

I would like it to look like this:

my_list =  ['12.7x55 mm PS12B', '102', '46', '57', '6', '5', '4']

I tried simple comprehensions:

my_list[1:] = [i.replace('\n</td>', '') for i in list] # works perfectly
my_list[1:] = [i.replace('<td>', '') for i in list] # works perfectly
# for example the second item in the list is now `102`
# not `<td style="border-bottom:solid 2px">102\n</td>`

but when I try to edit the last six elements using a more specific comprehension:

my_list[1:] = [i.replace(i, i[-1]) for i in list if "back" in i]

It deletes all other list elements that I have just extracted, and I end up with:

my_list =  ['12.7x55 mm PS12B', '6', '5', '4']

Since this is HTML, I am sure there is a less obscure way to do this (which I would appreciate knowing), but my main concern is that I don't understand what is going on with a simple Python comprehension.

CodePudding user response:

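The trailing if in a comprehension is a filter: any element that fails the condition is left out of the result, which is why the rest of your elements disappear. If you wish to keep them, use a conditional expression (value if condition else other) instead. Note that it goes before the for, not after it:

my_list[1:] = [
    i.replace(i, i[-1])  # same result as just i[-1]
    if "back" in i
    else i  # or however you wish to process the rest of the elements
    for i in list
]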
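
As for a less obscure way to clean the cells, here is a minimal sketch that shows the two comprehension forms side by side and strips the tags in one pass. It assumes the cells are as simple as the ones in the question (one tag pair, no ">" inside an attribute value). The names TAG_RE and strip_tags are made up for illustration, and for anything messier a real HTML parser would be the safer choice.

```python
import re

example_list = [
    '12.7x55 mm PS12B',
    '<td style="border-bottom:solid 2px">102\n</td>',
    '<td style="border-bottom:solid 2px">46\n</td>',
    '<td style="border-bottom:solid 2px">57\n</td>',
    '<td style="border-bottom:solid 2px; background-color:#00990080;">6\n</td>',
    '<td style="border-bottom:solid 2px; background-color:#00640080;">5\n</td>',
    '<td style="border-bottom:solid 2px; background-color:#FB9C0E80;">4\n</td>',
]

# Hypothetical helper: remove anything that looks like a tag, then trim
# the leftover whitespace (including the trailing newline).
TAG_RE = re.compile(r"<[^>]+>")

def strip_tags(cell: str) -> str:
    return TAG_RE.sub("", cell).strip()

# Trailing `if` is a filter: elements without "back" are dropped entirely.
filtered = [strip_tags(s) for s in example_list if "back" in s]

# Conditional expression before `for`: every element is kept, and the
# condition only decides which value is emitted for it.
kept = [strip_tags(s) if "<td" in s else s for s in example_list]

# strip_tags leaves tag-free strings unchanged, so no condition is needed.
my_list = [strip_tags(s) for s in example_list]

print(filtered)
print(my_list)
```

The first print shows only the three "background" cells, which is the same effect as in the question. The second shows all seven cleaned values.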