In Python, how to remove items from a list based on a specific string format?

I have a Python list as below:

merged_cells_lst = [
'P19:Q19
'P20:Q20
'P21:Q21
'P22:Q22
'P23:Q23
'P14:Q14
'P15:Q15
'P16:Q16
'P17:Q17
'P18:Q18
'AU9:AV9
'P10:Q10
'P11:Q11
'P12:Q12
'P13:Q13
'A6:P6
'A7:P7
'D9:AJ9
'AK9:AQ9
'AR9:AT9'
'A1:P1'
]

I only want to unmerge the cells in the P and Q columns. So I want to remove any items in merged_cells_lst that do not have the format "P##:Q##".

I think that regex is the best and simplest way to go about this. So far I have the following:

for item in merge_cell_lst:
    if re.match(r'P*:Q*'):
            pass
    else:
            merged_cell_lst.pop(item)

print(merge_cell_lst)

The code however is not working. I could use any additional tips/help. Thank you!

CodePudding user response：

Modifying a list while looping over it causes problems, because removing an item shifts the remaining ones and the loop skips elements. You can use a list comprehension instead to create a new list.
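
To illustrate the point above, here is a minimal sketch using a shortened, made-up list (the names `mutated` and `kept` are not from the question). Note also that `list.pop` expects an index, so passing the string itself raises a `TypeError`; `list.remove` is the call that removes by value.

```python
# Shortened sample list, assumed for illustration only.
mutated = ['A1:P1', 'A6:P6', 'P10:Q10']

# Removing inside the loop shifts later elements left, so the
# element right after each removed one is never visited.
for item in mutated:
    if not item.startswith('P'):
        mutated.remove(item)

print(mutated)  # 'A6:P6' is still there because the loop skipped it

# Building a new list from an untouched source avoids the problem.
original = ['A1:P1', 'A6:P6', 'P10:Q10']
kept = [item for item in original if item.startswith('P')]
print(kept)
```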

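Also, you need a different regular expression. The current pattern P*:Q* means "zero or more P, a colon, zero or more Q", so it matches PP:QQQ, :Q, or even :, but not P19:Q19. Your re.match call is also missing its second argument, the string to test.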

import re

merged_cells_lst = ['P19:Q19', 'P20:Q20', 'P21:Q21', 'P22:Q22', 'P23:Q23', 'P14:Q14', 'P15:Q15', 'P16:Q16', 'P17:Q17', 'P18:Q18', 'AU9:AV9', 'P10:Q10', 'P11:Q11', 'P12:Q12', 'P13:Q13', 'A6:P6', 'A7:P7', 'D9:AJ9', 'AK9:AQ9', 'AR9:AT9', 'A1:P1']

# P, one or more digits, a colon, Q, one or more digits
p = re.compile(r"P\d+:Q\d+")

output = [x for x in merged_cells_lst if p.match(x)]
print(output)
# ['P19:Q19', 'P20:Q20', 'P21:Q21', 'P22:Q22', 'P23:Q23', 'P14:Q14', 'P15:Q15',
#  'P16:Q16', 'P17:Q17', 'P18:Q18', 'P10:Q10', 'P11:Q11', 'P12:Q12', 'P13:Q13']
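
One caveat worth noting: `match` only anchors at the start of the string, so an item with extra trailing text would still pass. If stricter checking is wanted, `re.fullmatch` requires the whole string to fit the pattern. A small sketch, with sample strings made up for illustration:

```python
import re

# P, one or more digits, a colon, Q, one or more digits.
pattern = re.compile(r"P\d+:Q\d+")

# Hypothetical inputs, including one with trailing text.
samples = ['P19:Q19', 'P19:Q19:R19', 'A6:P6', 'AU9:AV9']

# match() only checks that the string starts with the pattern.
loose = [s for s in samples if pattern.match(s)]

# fullmatch() requires the entire string to fit the pattern.
strict = [s for s in samples if pattern.fullmatch(s)]

print(loose)   # ['P19:Q19', 'P19:Q19:R19']
print(strict)  # ['P19:Q19']
```

For the list in the question both calls give the same result, so this only matters if the input may contain other range formats.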

CodePudding user response：

Your list has some typos; it should look something like this:

merged_cells_lst = [ 'P19:Q19', 'P20:Q20', 'P21:Q21', ...]

Then something as simple as:

x = [k for k in merged_cells_lst if k[0] == 'P']

would work. This assumes you know a priori that the items you want to keep follow the Pxx:Qxx format and that no other item starts with P. If you want a dynamic solution, you can replace the condition in the list comprehension with a regex match.
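
As a rough sketch of that dynamic version, the column letters can be passed in as parameters. The helper name `filter_merged_ranges` and its arguments are hypothetical, introduced here only for illustration, and the sample list is a shortened version of the one in the question.

```python
import re

def filter_merged_ranges(ranges, start_col, end_col):
    """Keep only ranges shaped like <start_col><row>:<end_col><row>."""
    # re.escape guards against special characters in the column names.
    pattern = re.compile(
        rf"{re.escape(start_col)}\d+:{re.escape(end_col)}\d+"
    )
    # fullmatch rejects items with anything before or after the range.
    return [r for r in ranges if pattern.fullmatch(r)]

sample = ['P19:Q19', 'AU9:AV9', 'A6:P6', 'P10:Q10', 'A1:P1']
print(filter_merged_ranges(sample, 'P', 'Q'))
# ['P19:Q19', 'P10:Q10']
```

Unlike the `k[0] == 'P'` check, this also verifies the second column, so a range such as P5:R5 would be dropped.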