I'm trying to get the value of the variable abcd from a specific string, using the re module. Below is the code.
import re
text = '''
<script>var abcd = {"wishlist":{}}
'''
regex = 'abcd = (.*?)'
pattern = re.compile(regex, re.MULTILINE | re.DOTALL)
value = pattern.search(text).group(1)
print(value)  # prints an empty string; the expected value is {"wishlist":{}}
What am I doing wrong?
CodePudding user response:
Regex is the wrong tool for parsing JavaScript code. An alternative is to first extract the contents of the script tag (BeautifulSoup can do that).
Then feed that text into a JavaScript parser (for example slimit, which includes a JavaScript parser, lexer, pretty printer, and tree visitor) and walk the AST to read the variables and their values. Here is the pseudocode:
>>> from slimit.parser import Parser
>>> from slimit.visitors import nodevisitor
>>> from slimit import ast
>>>
>>> parser = Parser()
>>> tree = parser.parse('var abcd = {"wishlist":{}} ')
>>> for node in nodevisitor.visit(tree):
...     # `var abcd = ...` is parsed as a VarDecl node
...     if isinstance(node, ast.VarDecl):
...         print(node)  # Do something with the node
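If a full JavaScript parser is more than the task needs, and the assigned value happens to be plain JSON (as it is in the question), a standard-library-only sketch is possible. This is not the BeautifulSoup/slimit approach described above but a stand-in for it: `html.parser` collects the script text, a regex only locates where the value starts, and `json.JSONDecoder.raw_decode` parses the value itself. The names `ScriptCollector` and `extract_var` are made up for this illustration, and the sketch assumes the `<script>` element is properly closed.

```python
import json
import re
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collect the text inside every <script> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Script text may arrive in several chunks, so append rather than assign.
        if self.in_script:
            self.scripts[-1] += data


def extract_var(html, name):
    """Return the JSON value assigned to `var <name>`, or None if not found.

    Assumes the value is valid JSON; raw_decode raises JSONDecodeError otherwise.
    """
    collector = ScriptCollector()
    collector.feed(html)
    collector.close()
    # The regex only finds where the value begins; it does not parse the value.
    start = re.compile(r"\bvar\s+" + re.escape(name) + r"\s*=\s*")
    for script in collector.scripts:
        match = start.search(script)
        if match:
            # raw_decode parses one JSON value and ignores whatever follows it.
            value, _end = json.JSONDecoder().raw_decode(script, match.end())
            return value
    return None


html = '<script>var abcd = {"wishlist":{}}</script>'
print(extract_var(html, "abcd"))  # {'wishlist': {}}
```

Because the JSON decoder tracks nested braces and quoted strings itself, this avoids guessing where the value ends, which is the part a regex handles poorly. For values that are JavaScript but not JSON (unquoted keys, function calls), a real JavaScript parser is still the safer choice.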
CodePudding user response:
Code:
import re
text = ''' <script>var abcd = {"wishlist":{}} '''
regex = r'var abcd = (.*)'
match = re.search(regex, text)
if match:
    print(match.group(1))
else:
    print("Match abcd not found.")
# Output: {"wishlist":{}}
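Simply include var in the regex and use a greedy (.*) instead of the lazy (.*?). A lazy group with nothing after it matches as few characters as possible, which is zero, and that is why the original code printed an empty string.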
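To see why the pattern from the question returned an empty string, it may help to compare three variants side by side. The variable names below (`lazy`, `greedy`, `anchored`) are only labels for this illustration.

```python
import re

text = '''
<script>var abcd = {"wishlist":{}}
'''

# Lazy group at the very end of the pattern: nothing forces it to grow,
# so it stops immediately and captures zero characters.
lazy = re.search(r'abcd = (.*?)', text).group(1)

# Greedy group: takes everything up to the end of the line
# ('.' does not match a newline unless re.DOTALL is set).
greedy = re.search(r'abcd = (.*)', text).group(1)

# Lazy group followed by an anchor: it has to grow until '$' can match,
# which with re.MULTILINE is the end of the line.
anchored = re.search(r'abcd = (.*?)$', text, re.MULTILINE).group(1)

print(repr(lazy))      # ''
print(repr(greedy))    # '{"wishlist":{}}'
print(repr(anchored))  # '{"wishlist":{}}'
```

Note that the original code also passed re.DOTALL; with that flag a greedy (.*) would run past the end of the line and capture the trailing newline as well, so the flag is best dropped here.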