I'm trying to write a quick script to streamline some boring accounting. Basically, I have a folder full of files with names similar to those in the list below.
I need to rename the files as indicated by the first couple of file names.
I have a clear idea of how to do this and was writing a quick script to get it done, but I hit a bit of a silly problem. I want to use a list comprehension to get a list of the dates, sort of as illustrated in the last line. Ideally, what I want to do would be:
[re.search(date_pattern, file).group() for file in list_of_reciepts]
But this fails on filenames that are missing a date field, because re.search returns None for them and None has no group method.
Any thoughts on a nice neat alternative?
import re
list_of_reciepts = [
'2021-10-18 1.pdf',
'2021-10-18 2.pdf',
'2021-10-18 3.pdf',
'Financial History - Linkt.pdf',
'Scan from 2021-10-04 05_14_16 PM.pdf',
'Scan from 2021-10-07 11_41_26 AM.pdf',
'Scan from 2021-10-19 05_13_22 PM.pdf',
]
date_pattern = re.compile(r'\d{4}-\d{2}-\d{2}')
>>> [re.search(date_pattern, file) for file in list_of_reciepts]
[<re.Match object; span=(0, 10), match='2021-10-18'>,
<re.Match object; span=(0, 10), match='2021-10-18'>,
<re.Match object; span=(0, 10), match='2021-10-18'>,
None,
<re.Match object; span=(10, 20), match='2021-10-04'>,
<re.Match object; span=(10, 20), match='2021-10-07'>,
<re.Match object; span=(10, 20), match='2021-10-19'>]
CodePudding user response:
If you use Python >= 3.8, you can use the walrus operator:
>>> [sre.group() for file in list_of_reciepts
if (sre := re.search(date_pattern, file))]
['2021-10-18',
'2021-10-18',
'2021-10-18',
'2021-10-04',
'2021-10-07',
'2021-10-19']
For Python < 3.8, use a double comprehension:
>>> [sre.group() for sre in [re.search(date_pattern, file)
for file in list_of_reciepts] if sre]
['2021-10-18',
'2021-10-18',
'2021-10-18',
'2021-10-04',
'2021-10-07',
'2021-10-19']
If you want to keep None:
>>> [sre.group() if (sre := re.search(date_pattern, file)) else None
for file in list_of_reciepts]
['2021-10-18',
'2021-10-18',
'2021-10-18',
None,
'2021-10-04',
'2021-10-07',
'2021-10-19']
CodePudding user response:
Use the walrus operator:
res = [x.group() if (x := re.search(date_pattern, file)) else None for file in list_of_reciepts]
print(res)
Output:
['2021-10-18', '2021-10-18', '2021-10-18', None, '2021-10-04', '2021-10-07', '2021-10-19']
As an alternative, since you are compiling the regular expression, you could use map as below:
res = [match.group() if match else match for match in map(date_pattern.search, list_of_reciepts)]
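As a rough, self-contained sketch of the map variant above (the file names are copied from the question; the intermediate name `matches` is only for illustration), the compiled pattern's bound `search` method is applied to each name and missing dates stay as None:

```python
import re

# File names copied from the question.
list_of_reciepts = [
    '2021-10-18 1.pdf',
    '2021-10-18 2.pdf',
    '2021-10-18 3.pdf',
    'Financial History - Linkt.pdf',
    'Scan from 2021-10-04 05_14_16 PM.pdf',
    'Scan from 2021-10-07 11_41_26 AM.pdf',
    'Scan from 2021-10-19 05_13_22 PM.pdf',
]
date_pattern = re.compile(r'\d{4}-\d{2}-\d{2}')

# map() lazily calls date_pattern.search on every file name,
# yielding either a Match object or None.
matches = map(date_pattern.search, list_of_reciepts)

# Keep the matched text where there is one, None otherwise.
res = [m.group() if m else None for m in matches]
print(res)
```

This should work on Python versions older than 3.8 as well, since it does not rely on an assignment expression.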
CodePudding user response:
You can use getattr for a cleaner and shorter approach than using an assignment expression with a conditional:
import re
v = ['2021-10-18 1.pdf', '2021-10-18 2.pdf', '2021-10-18 3.pdf', 'Financial History - Linkt.pdf', 'Scan from 2021-10-04 05_14_16 PM.pdf', 'Scan from 2021-10-07 11_41_26 AM.pdf', 'Scan from 2021-10-19 05_13_22 PM.pdf']
p = re.compile(r'\d{4}-\d{2}-\d{2}')
r = [getattr(re.search(p, i), 'group', lambda: None)() for i in v]
Output:
['2021-10-18', '2021-10-18', '2021-10-18', None, '2021-10-04', '2021-10-07', '2021-10-19']
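For comparison, if the one-liner feels too dense, the same None-preserving result can probably be had with a small named helper. This is only a sketch: `first_date` and `names` are hypothetical names not used anywhere above, and the sample list is a shortened copy of the one in the question.

```python
import re

date_pattern = re.compile(r'\d{4}-\d{2}-\d{2}')


def first_date(name, pattern=date_pattern):
    """Return the first date-like substring in name, or None if there is none."""
    match = pattern.search(name)
    return match.group() if match else None


# Shortened sample of the file names from the question.
names = [
    '2021-10-18 1.pdf',
    'Financial History - Linkt.pdf',
    'Scan from 2021-10-04 05_14_16 PM.pdf',
]

dates = [first_date(n) for n in names]
print(dates)
```

The comprehension then stays short, and the None handling lives in one place that is easy to change later, for example to return a placeholder string instead.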