I want to extract the email address from filenames. Here is a set of example filenames:
string1 = "[email protected]_2022-05-11T11_59_58 00_00.pdf"
string2 = "[email protected]_test.pdf"
string3 = "[email protected]"
I would like to split each filename into two parts: the first contains the email address and the second contains the rest. For string2, the result should be:
['[email protected]', '_test.pdf']
I tried this regex, but it does not work for the second and third strings:
email = re.search(r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+", string)
Thank you for your help
CodePudding user response:
Given the samples you provided, you can do something like this:
import re
strings = ["[email protected]_2022-05-11T11_59_58 00_00.pdf",
"[email protected]_test.pdf",
"[email protected]"]
pattern = r'([^@]+@[\.A-Za-z]+)(.*)'
[re.findall(pattern, string)[0] for string in strings]
Output:
[('[email protected]', '_2022-05-11T11_59_58 00_00.pdf'),
('[email protected]', '_test.pdf'),
('[email protected]', '-fdsdfsd-saf.pdf')]
Mail pattern explanation ([^@]+@[\.A-Za-z]+):
[^@]+ : one or more characters other than @
@ : the literal at sign
[\.A-Za-z]+ : one or more letters or dots
Rest pattern explanation (.*):
(.*) : any sequence of characters (possibly empty)
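If it helps, here is a minimal sketch that wraps the same pattern in a small function and handles names that contain no email at all. The function name split_filename and the sample filenames are made up for illustration; they are not taken from the question.

```python
import re

# Same idea as the pattern above: group 1 is the email, group 2 is the rest.
# The domain part only allows letters and dots, so it stops at the first
# '_' or '-' that follows the address.
PATTERN = re.compile(r'([^@]+@[.A-Za-z]+)(.*)')

def split_filename(filename):
    """Return [email, rest], or None if the name has no email-like prefix."""
    match = PATTERN.match(filename)
    if match is None:
        return None
    return [match.group(1), match.group(2)]

# Hypothetical filenames, used only to illustrate the behaviour.
for name in ["alice@example.com_test.pdf",
             "bob@mail.example.org-report-v2.pdf",
             "no_email_here.pdf"]:
    print(split_filename(name))
```

Note that this relies on the separator after the address not being a letter or a dot. A domain containing digits or hyphens would be cut short, so the character class would need adjusting for such data.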