I have a Python list as below:
merged_cells_lst = [
'P19:Q19
'P20:Q20
'P21:Q21
'P22:Q22
'P23:Q23
'P14:Q14
'P15:Q15
'P16:Q16
'P17:Q17
'P18:Q18
'AU9:AV9
'P10:Q10
'P11:Q11
'P12:Q12
'P13:Q13
'A6:P6
'A7:P7
'D9:AJ9
'AK9:AQ9
'AR9:AT9'
'A1:P1'
]
I only want to unmerge the cells in the P and Q columns. Therefore, I seek to remove any strings/items in the merged_cells_lst that does not have the format "P##:Q##"
.
I think that regex is the best and most simple way to go about this. So far I have the following:
for item in merge_cell_lst:
if re.match(r'P*:Q*'):
pass
else:
merged_cell_lst.pop(item)
print(merge_cell_lst)
The code however is not working. I could use any additional tips/help. Thank you!
CodePudding user response:
Modifying a list while looping over it causes troubles. You can use list comprehension instead to create a new list.
Also, you need a different regex expression. The current pattern P*:Q*
matches PP:QQQ
, :Q
, or even :
, but not P19:Q19
.
import re
merged_cells_lst = ['P19:Q19', 'P20:Q20', 'P21:Q21', 'P22:Q22', 'P23:Q23', 'P14:Q14', 'P15:Q15', 'P16:Q16', 'P17:Q17', 'P18:Q18', 'AU9:AV9', 'P10:Q10', 'P11:Q11', 'P12:Q12', 'P13:Q13', 'A6:P6', 'A7:P7', 'D9:AJ9', 'AK9:AQ9', 'AR9:AT9', 'A1:P1']
p = re.compile(r"P\d :Q\d ")
output = [x for x in merged_cells_lst if p.match(x)]
print(output)
# ['P19:Q19', 'P20:Q20', 'P21:Q21', 'P22:Q22', 'P23:Q23', 'P14:Q14', 'P15:Q15',
# 'P16:Q16', 'P17:Q17', 'P18:Q18', 'P10:Q10', 'P11:Q11', 'P12:Q12', 'P13:Q13']
CodePudding user response:
Your list has some typos, should look something like this:
merged_cells_lst = [ 'P19:Q19', 'P20:Q20', 'P21:Q21', ...]
Then something as simple as:
x = [k for k in merged_cells_lst if k[0] == 'P']
would work. This is assuming that you know a priori that the pattern you want to remove follows the Pxx:Qxx format. If you want a dynamic solution then you can replace the condition in the list comprehension with a regex match.