I'm trying to find a way to pass a string (coming from outside the Python world!) that can be interpreted as **kwargs once it gets to the Python side.
I have been trying to use this pyparsing example, but the string that's being passed in this example is too specific, and I had never heard of pyparsing until now. I'm trying to make it more human-friendly and robust to small differences in spacing, etc. For example, I would like to pass the following:
input_str = "a = [1,2], b= False, c =('abc', 'efg'),d=1"
desired_kwargs = {'a': [1,2], 'b': False, 'c': ('abc','efg'), 'd': 1}
When I try this code though, no love.
from pyparsing import *
# Names for symbols
_quote = Suppress('"')
_eq = Suppress('=')
# Parsing grammar definition
data = (
    delimitedList(  # Zero or more comma-separated items
        Group(  # Group the contained unsuppressed tokens in a list
            Regex(r'[^=,)\s]+') +  # Grab everything up to an equal, comma, endparen or whitespace as a token
            Optional(  # Optionally...
                _eq +  # match an =
                _quote +  # a quote
                Regex(r'[^"]*') +  # Grab everything up to another quote as a token
                _quote)  # a quote
        )  # EndGroup - will have one or two items.
    ))  # EndList
def process(s):
    items = data.parseString(s).asList()
    args = [i[0] for i in items if len(i) == 1]
    kwargs = {i[0]:i[1] for i in items if len(i) == 2}
    return args,kwargs
def hello_world(named_arg, named_arg_2 = 1, **kwargs):
    print(process(kwargs))
hello_world(1, 2, "my_kwargs_are_gross = True, some_bool=False, a_list=[1,2,3]")
#output: "{my_kwargs_are_gross : True, some_bool:False, a_list:[1,2,3]}"
Requirements:
- The '{' and '}' will be appended on the code side.
- Only standard types / standard iterables (list, tuple, etc) will be used in the kwargs-string. No special characters that I can think of...
- The kwargs-string will be written the way arguments are entered into a function on the Python side, i.e. 'x=1, y=2'. Not as a string of a dictionary.
- I think it's a safe assumption that the first step in the string parse will be to remove all whitespace.
CodePudding user response:
One option could be to use the ast module to parse some wrapping of the string that turns it into a valid Python expression. Then you can even use ast.literal_eval if you’re okay with everything it can produce:
>>> import ast
>>> kwargs = "a = [1,2], b= False, c =('abc', 'efg'),d=1"
>>> expr = ast.parse(f"dict({kwargs}\n)", mode="eval")
>>> {kw.arg: ast.literal_eval(kw.value) for kw in expr.body.keywords}
{'a': [1, 2], 'b': False, 'c': ('abc', 'efg'), 'd': 1}
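The same idea can be wrapped in a small helper. This is only a sketch: the name parse_kwargs is hypothetical, and the decision to reject positional arguments and ** unpacking is an assumption added here, not part of the snippet above.

```python
import ast


def parse_kwargs(text: str) -> dict:
    """Parse a string like "a=1, b=[2, 3]" into a dict of literal values.

    Hypothetical helper: the name and the error handling are illustrative.
    """
    # Wrap the text in a call so that it parses as a single expression.
    # The newline keeps a trailing "#" comment from swallowing the ")".
    tree = ast.parse(f"dict({text}\n)", mode="eval")
    call = tree.body
    is_dict_call = (
        isinstance(call, ast.Call)
        and isinstance(call.func, ast.Name)
        and call.func.id == "dict"
    )
    if not is_dict_call or call.args:
        raise ValueError("only keyword arguments are allowed")
    result = {}
    for kw in call.keywords:
        if kw.arg is None:  # a "**mapping" unpacking has no keyword name
            raise ValueError("'**' unpacking is not allowed")
        # literal_eval accepts an AST node and raises ValueError for
        # anything that is not a plain literal (numbers, strings, lists, ...).
        result[kw.arg] = ast.literal_eval(kw.value)
    return result


print(parse_kwargs("a = [1,2], b= False, c =('abc', 'efg'),d=1"))
```

Nothing from the input is executed here; a value such as print(1) is rejected by ast.literal_eval with a ValueError, and malformed input surfaces as a SyntaxError from ast.parse.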
CodePudding user response:
Since the format of your input string is already a valid Python argument list, you don't have to reinvent the wheel with pyparsing but can simply enclose the string in a dict constructor for eval to create the desired kwargs:
desired_kwargs = eval(f'dict({input_str})')
However, evaluating a string from the outside world carries the security risk of code injection. Since doing real harm generally requires a function call, an easy way to reduce the risk is to parse the code with ast.parse and use ast.walk to reject the AST if it contains more than one ast.Call node (there has to be exactly one ast.Call node, since we are making a call to the dict constructor). Note that this check does not stop call-free expressions that are merely expensive to evaluate, such as 9**9**9. The check looks like this:
import ast
code = f'dict({input_str})'
assert sum(isinstance(node, ast.Call) for node in ast.walk(ast.parse(code))) == 1
desired_kwargs = eval(code)
Demo: https://replit.com/@blhsing/OrnateScarceShelfware
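A sketch of the same check wrapped in a function. The name kwargs_from_string is hypothetical, and two details are assumptions added here rather than part of the code above: it raises ValueError instead of using assert (assertions are skipped when Python runs with -O), and it parses in "eval" mode so the tree that gets inspected matches what eval will evaluate.

```python
import ast


def kwargs_from_string(input_str: str) -> dict:
    """Evaluate "a=1, b=[2]" as keyword arguments to dict().

    Hypothetical helper; it still relies on eval, so treat it as a sketch.
    """
    code = f"dict({input_str})"
    tree = ast.parse(code, mode="eval")
    # Count every function call in the expression; the dict(...) wrapper
    # itself accounts for exactly one.
    n_calls = sum(isinstance(node, ast.Call) for node in ast.walk(tree))
    if n_calls != 1:
        raise ValueError(f"expected exactly one call, found {n_calls}")
    return eval(code)


print(kwargs_from_string("a = [1,2], b= False, c =('abc', 'efg'),d=1"))
```

An input such as a=print(1) contains a second call and is rejected before eval ever runs.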
CodePudding user response:
You already have some good answers (much easier than this one) if the string you are being passed is well-behaved Python. But if you don't trust the input and/or want to define something a little different, then being explicit about the format you expect may be desirable. In that case, pyparsing is quite useful and readable. The grammar from the question you linked isn't complex enough to handle all your cases, but if you break your grammar out into its constituent elements it is relatively easy to build:
from pyparsing import *
string_arg = QuotedString("'", esc_char="\\", unquote_results=False) | QuotedString("\"", esc_char="\\", unquote_results=False)
# Try the float form first; Combine joins "1", ".", "5" into the single token "1.5"
number_arg = Combine(Word(nums) + "." + Word(nums)) | Word(nums)
boolean_arg = Literal("True") | Literal("False")
array_item = string_arg | number_arg
array_list = delimitedList(array_item)
array_arg = Literal("[") + array_list + Literal("]")
tuple_arg = Literal("(") + array_list + Literal(")")
arg_name = Word(identchars, identbodychars)
arg_value = string_arg | number_arg | boolean_arg | tuple_arg | array_arg
arg_item = arg_name + Literal("=").suppress() + arg_value
arg_list = delimitedList(arg_item)
def parseActionValue(string, location, tokens):
    # Convert the matched tokens for one value into a single Python object
    emit_tokens = []
    if tokens[0] == '[':
        # Wrap in a list so pyparsing emits the list as one token
        emit_tokens = [eval('[' + ','.join(tokens[1:-1]) + ']')]
    elif tokens[0] == '(':
        emit_tokens = eval('(' + ','.join(tokens[1:-1]) + ')')
    else:
        emit_tokens = eval(tokens[0])
    return emit_tokens
arg_value.setParseAction(parseActionValue)
def construct_args(s):
    arr = arg_list.parse_string(s, parse_all=True)
    args = {}
    # Tokens alternate name, value, name, value, ...
    for i in range(0,len(arr),2):
        args[arr[i]] = arr[i + 1]
    return args
Where you want to do something a little different or do verification that the tokens look like you expect, you add another setParseAction on the element that you want to work with and emit the Python objects you want in the dict.
CodePudding user response:
python-makefun can parse these sorts of strings and may be useful for whatever the use case of the original question is:
import inspect
import makefun
def process_signature(sig: str) -> dict:
    sig = f"f({sig})"
    f = makefun.create_function(sig, (lambda: None))
    result = {}
    for name, arg in inspect.signature(f).parameters.items():
        result[name] = arg.default
    return result
process_signature("a = [1,2], b= False, c =('abc', 'efg'),d=1")
That outputs the desired result: {'a': [1, 2], 'b': False, 'c': ('abc', 'efg'), 'd': 1}