Is there a way to detect different information from closed brackets? Example:
Text = Hello World, this is my name (Greta). For this example (i need an example in a bracket)
In my case I would like to only detect the brackets with more than one word inside using a regex.
Expected output: i need an example in a bracket
CodePudding user response:
\((\w+\s+\w+[^)]*)\)
is probably close to what you want.
- \( is an open-paren literal and \) is a close-paren literal.
- (...) creates a capturing group, so we can get only the inside of the parens.
- \w matches alphanumeric characters (and the underscore).
- \s matches whitespace characters.
- [^)] is the character group of anything other than a close-paren ).
- [^)]* selects any number of non-) characters (including 0).
- \) matches the close paren.
- Taken together, we check for a (, then an alphanumeric word, then one or more whitespace characters, then another alphanumeric word, then take as much as we need of things that aren't ), and then a close paren.
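As a quick check, here is a minimal Python sketch of the pattern described above, assuming the quantifiers are + as the explanation says ("a word, then one or more spaces, then another word"). The sample text is the one from the question; the names PATTERN and matches are only illustrative.

```python
import re

# Pattern from the answer above; the capturing group holds the text inside the parens.
PATTERN = re.compile(r'\((\w+\s+\w+[^)]*)\)')

text = 'Hello World, this is my name (Greta). For this example (i need an example in a bracket)'

# With exactly one capturing group, findall returns the group contents,
# not the whole match, so the parentheses are already stripped.
matches = PATTERN.findall(text)
print(matches)  # ['i need an example in a bracket']
```

(Greta) is skipped because there is no whitespace followed by a second word before the closing paren.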
Note that this fails in cases like Hello (Alice and (hello) Bob), where it will capture Alice and (hello. (This is a fundamental limitation of plain regular expressions; you will have to use another method if you need to parse nested parentheses.)
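One possible "other method" for the nested case is a small hand-written scanner that tracks parenthesis depth. This is only a sketch, not something from the answer: the helper names top_level_groups and multi_word are made up for illustration, and unbalanced parentheses are simply ignored.

```python
def top_level_groups(text):
    """Return the contents of each outermost (...) group, nested parens included."""
    groups = []
    depth = 0
    start = None
    for i, ch in enumerate(text):
        if ch == '(':
            if depth == 0:
                start = i + 1  # content begins right after the outermost '('
            depth += 1
        elif ch == ')' and depth > 0:
            depth -= 1
            if depth == 0:
                groups.append(text[start:i])
    return groups


def multi_word(groups):
    """Keep only groups containing more than one whitespace-separated word."""
    return [g for g in groups if len(g.split()) > 1]


print(multi_word(top_level_groups('Hello (Alice and (hello) Bob)')))
# ['Alice and (hello) Bob']
```

Unlike the regex, this keeps the inner (hello) as part of the outer group instead of stopping at the first close paren.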
CodePudding user response:
You could use:
(\(\s*[A-Za-z]+\s[A-Za-z\s]+\))
Which will look for:
\( : open parenthesis
\s* : zero or more spaces
[A-Za-z]+ : followed by one or more letters
\s : followed by a space
[A-Za-z\s]+ : followed by one or more letters and/or spaces
\) : close parenthesis
Code:
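import re

text = 'Hello World, this is my name (Greta). For this example (i need an example in a bracket)'
# The whole pattern is one group, so findall returns each match with its parentheses
m = re.findall(r'(\(\s*[A-Za-z]+\s[A-Za-z\s]+\))', text)
for match in m:
    # Strip the surrounding parentheses before printing
    print(match[1:-1])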
Output:
i need an example in a bracket
CodePudding user response:
You have to analyze what you really need from the text and start from the basics. I did it this way:
/\([a-zA-Z ]*\)
I would also recommend this tool for trying it out, in case you need it.
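If you would rather try that last pattern in Python, here is a rough sketch. It assumes the leading / is a regex-literal delimiter and not part of the pattern, and it adds a capturing group (not in the answer) so the parentheses are dropped. Run on the question's text, the pattern on its own also picks up single-word brackets, so the sketch filters by word count afterwards.

```python
import re

text = 'Hello World, this is my name (Greta). For this example (i need an example in a bracket)'

# Assumption: the leading "/" in the answer is a delimiter, so it is omitted here.
# The inner (...) group is added so findall returns only the bracket contents.
candidates = re.findall(r'\(([a-zA-Z ]*)\)', text)
print(candidates)  # ['Greta', 'i need an example in a bracket']

# Keep only the brackets with more than one word.
multi_word = [c for c in candidates if len(c.split()) > 1]
print(multi_word)  # ['i need an example in a bracket']
```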