I am trying to extract the substrings that come before either a left parenthesis or a dot with regexp(). For example:
example1='qwer(1).asdf; qwer(1).zxcv;';
example2='qwer.asdf; qwer.zxcv;';
I tried
expression='(?<varname>.*?)(\(\d+\)){?}.';
expression='(?<varname>.*?)(?<others>\(\d+\)){?}.';
expression='(?<varname>.*?)\(';
expression='(?<varname>.*?)(';
expression='(?<varname>.*?)/(';
with
parts=regexp(example1,expression,'names');
None worked.
How exactly does matching a literal parenthesis work in regexp()?
The official documentation doesn't mention how to match characters that also serve as operators, quantifiers, etc.
CodePudding user response:
You can use
example1 = 'qwer(1).asdf; qwer(1).zxcv;';
expression = '(?<varname>\w+)[(.]';
parts=regexp(example1,expression,'names');
Details:
(?<varname>\w+) - Group "varname": one or more word chars
[(.] - a ( or . char (inside a character class, both are literal and need no backslash).
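As a quick check, the pattern above can be run against both strings from the question. This is only a minimal sketch: the variable names parts1, parts2, names1 and names2 are made up for illustration, and it assumes a MATLAB release in which the 'names' option returns a struct array with one element per match.

```matlab
% Both sample strings from the question.
example1 = 'qwer(1).asdf; qwer(1).zxcv;';
example2 = 'qwer.asdf; qwer.zxcv;';

% \w+  : one or more word characters, captured as the named token "varname".
% [(.] : a literal "(" or "."; inside brackets neither needs a backslash.
expression = '(?<varname>\w+)[(.]';

parts1 = regexp(example1, expression, 'names');
parts2 = regexp(example2, expression, 'names');

% Collect the captured names of all matches into cell arrays
% (names1 and names2 are illustrative names, not from the original post).
names1 = {parts1.varname};
names2 = {parts2.varname};

disp(names1)
disp(names2)
```

Outside a character class, the same characters would have to be written as \( and \. to be matched literally.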
CodePudding user response:
/(?<varname>(\S*?)(?:\(.*?;)|(\S*?)(?:\..*?;))/g
The pattern above is my attempt at answering your question directly. Here is a simpler illustration of how parentheses are matched in a regex:
/\(.*?\)/g
This matches each pair of parentheses and everything between them.
The \ escapes the parenthesis so that it is treated as a literal character. The *? matches as few characters as possible, so the match does not stretch from the first opening parenthesis to the last closing one. The global flag applies the pattern to every occurrence.
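The /.../g notation is the JavaScript-style syntax shown on sites like regex101. In MATLAB the pattern is passed as a plain character vector without the slashes, and regexp() returns every match by default (the 'once' option limits it to the first), so no global flag is needed. A minimal sketch of the lazy versus greedy difference, using a hypothetical input string that varies the question's example:

```matlab
% Hypothetical input, adapted from the question so the two groups differ.
str = 'qwer(1).asdf; qwer(2).zxcv;';

% Lazy quantifier: each match stops at the first ")" after a "(".
lazy = regexp(str, '\(.*?\)', 'match');

% Greedy quantifier: one match running from the first "(" to the last ")".
greedy = regexp(str, '\(.*\)', 'match');

disp(lazy)
disp(greedy)
```

Here lazy should hold the two short groups, while greedy should hold a single long match spanning both.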
https://regex101.com/ is a great resource