I haven't found any helpful Regex tools to help me figure this complicated pattern out.
I have the following string:
Myfirstname Mylastname, Department of Mydepartment, Mytitle, The University of Me; 4-1-1, Hong,Bunk, Tokyo 113-8655, Japan E-mail:[email protected], Tel:00-00-222-1171, Fax:00-00-225-3386
I am trying to learn enough Regex patterns to remove the substrings one at a time:
E-mail:[email protected]
Tel:00-00-222-1171
Fax:00-00-225-3386
So I think the correct pattern would be to remove everything from a given word (i.e., "E-mail", "Tel") all the way through the following comma.
Is this type of dynamic pattern possible in Regex?
I am performing the match in Python, but I don't think that matters much.
Also, I know the data string looks comma separated, and it is. However, there is no guarantee that the order of those fields is preserved. That's why I'm trying to use a Regex match.
CodePudding user response:
How about this regex:
<YOUR_WORD>.*?(?=(,|($)))
Explanation:
- It looks for the word specified in the <YOUR_WORD> placeholder
- It looks for any kind of character afterwards
- The search stops when it hits one of the two options:
  - It finds the character ,
  - It finds the end of the line
So:
E-mail.*?(?=(,|($)))
Will result in:
E-mail:[email protected]
And
Fax.*?(?=(,|($)))
Will result in:
Fax:00-00-225-3386
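Since the question mentions Python, here is a minimal sketch of how the pattern could be built for any word and run with the standard re module. The helper name find_field is made up for illustration, and the lookahead is written as (?=,|$), which should behave the same as (?=(,|($))) without the extra groups:

```python
import re

# The sample string from the question, copied as it appears in the post.
text = (
    "Myfirstname Mylastname, Department of Mydepartment, Mytitle, "
    "The University of Me; 4-1-1, Hong,Bunk, Tokyo 113-8655, Japan "
    "E-mail:[email protected], Tel:00-00-222-1171, Fax:00-00-225-3386"
)

def find_field(word, s):
    """Return the text from `word` up to, but not including, the next comma
    (or the end of the string). Hypothetical helper, not part of `re`."""
    # re.escape guards against regex metacharacters in the word itself.
    pattern = re.escape(word) + r".*?(?=,|$)"
    match = re.search(pattern, s)
    return match.group(0) if match else None

print(find_field("E-mail", text))  # E-mail:[email protected]
print(find_field("Tel", text))     # Tel:00-00-222-1171
print(find_field("Fax", text))     # Fax:00-00-225-3386
```

The lookahead only checks for the comma and does not consume it, so the comma is not part of the match.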
If there are edge cases it misses, I would like to know, along with whether handling them affects performance or is even necessary.
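On edge cases, two come to mind when the goal is removal rather than matching: the separating comma is left behind, and the word could also match in the middle of a longer word. The following is only a sketch of one way to handle both, using a shortened tail of the question's string; remove_field is a hypothetical helper, and it assumes a field value never contains a comma:

```python
import re

# Tail of the sample string from the question, copied as it appears in the post.
text = (
    "Tokyo 113-8655, Japan E-mail:[email protected], "
    "Tel:00-00-222-1171, Fax:00-00-225-3386"
)

def remove_field(word, s):
    """Remove `word` and everything up to the next comma, together with the
    comma and whitespace in front of it. Hypothetical helper."""
    # ,?\s*  swallows the separator that precedes the field
    # \b     stops the word from matching in the middle of a longer word
    # [^,]*  runs up to the next comma or the end of the string
    pattern = r",?\s*\b" + re.escape(word) + r"[^,]*"
    return re.sub(pattern, "", s)

for label in ("E-mail", "Tel", "Fax"):
    text = remove_field(label, text)

print(text)  # Tokyo 113-8655, Japan
```

If the field to remove is the very first item in the string, this leaves a leading comma behind, so that case would need extra cleanup. Also note that without re.MULTILINE, $ only matches at the end of the whole string (or just before a final newline), which matters if the input spans several lines.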