I am working on the following regex problem that I have almost solved.
The goal is to find words that start with either a special character or a space and end with one of these extensions, including the period: .qvd, .txt, .xlsx
For example: "list.xlsx random %ford.txt #catch.qvd cars roads"
From the above string I need to extract the following: list.xlsx, ford.txt and catch.qvd
[#%\S]\w+\.txt
My solution only matches words that end with .txt. How can I change my regex to also include .qvd and .xlsx?
CodePudding user response:
In the pattern [#%\S]\w+\.txt, the \S already matches # and %, so the character class is redundant and the pattern is the same as \S\w+\.txt.
That requires the match to start with a non-whitespace character, so any "special char" is included in the match, and the part before the extension must be at least 2 characters long.
If there can be either a "special char", a space, or the start of the string to the left, you can start the match directly with word characters, followed by any of the alternatives in a non-capturing group (?:txt|qvd|xlsx) and a word boundary \b at the end to prevent a partial word match.
\w+\.(?:txt|qvd|xlsx)\b
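As a quick check, here is a minimal Python sketch that runs a pattern of this shape against the example string from the question. The function name extract_filenames is only illustrative, and Python's re module is an assumption, since the question does not say which regex engine is in use.

```python
import re

# Word characters, a literal dot, one of the three extensions,
# then a word boundary so "notes.txtx" is not accepted.
PATTERN = re.compile(r"\w+\.(?:txt|qvd|xlsx)\b")

def extract_filenames(text):
    """Return every file name in text that ends in .txt, .qvd or .xlsx."""
    # The group is non-capturing, so findall returns the full matches.
    return PATTERN.findall(text)

sample = "list.xlsx random %ford.txt #catch.qvd cars roads"
print(extract_filenames(sample))  # ['list.xlsx', 'ford.txt', 'catch.qvd']
```

Because the match starts at the first word character, the leading % and # are never part of the result, so no extra stripping step is needed.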
CodePudding user response:
Use the | (alternation operator/metacharacter) to express an "or" relation between two or more subexpressions:
(?<!\S)[#%]\S+\.(?:txt|qvd|xlsx)
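To see how the alternation behaves here, the following Python sketch assumes the pattern is read with a + after \S and the extension spelled xlsx. Read that way, it requires a leading # or %, keeps that character in the match, and skips list.xlsx. The second pattern, OPTIONAL_PREFIX, is a hypothetical variant that is not part of the answer: it makes the prefix optional and captures only the file name.

```python
import re

sample = "list.xlsx random %ford.txt #catch.qvd cars roads"

# Assumed reading of the pattern above: "+" after \S, extension "xlsx".
# (?<!\S) means "not preceded by a non-whitespace character".
PREFIXED = re.compile(r"(?<!\S)[#%]\S+\.(?:txt|qvd|xlsx)")
prefixed_matches = PREFIXED.findall(sample)
print(prefixed_matches)  # ['%ford.txt', '#catch.qvd']

# Hypothetical variant: the prefix is optional and sits outside the
# capturing group, so findall returns only the file names.
OPTIONAL_PREFIX = re.compile(r"(?<!\S)[#%]?(\S+\.(?:txt|qvd|xlsx))")
optional_matches = OPTIONAL_PREFIX.findall(sample)
print(optional_matches)  # ['list.xlsx', 'ford.txt', 'catch.qvd']
```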