How to convert string dictionary value into dict type in Python

Time:09-25

I am reading text from files, and the text looks like this:

"(id=336346860, name='Western Australia', slug='western-australia', has_public_page=True, lat=-26.0, lng=121.0)"

I would like to convert this to a dict. I tried passing it to dict(), but that raises an error:

fileOutput = "(id=336346860, name='Western Australia', slug='western-australia', has_public_page=True, lat=-26.0, lng=121.0)"
x = dict(fileOutput)

Error:

ValueError: dictionary update sequence element #0 has length 1; 2 is required

Can someone help me find a solution for this?

CodePudding user response:

A more robust approach is to prefix the string with an identifier, such as _, so that it becomes valid Python syntax for a function call. You can then parse it with ast.parse, traverse the tree with ast.walk, and look for the ast.Call node, whose keywords attribute holds the list of keyword arguments. Each keyword gives you the name in its arg attribute and the value in its value attribute. Since the value can itself be an expression, such as -26.0 in your sample input (a constant 26.0 with a unary - applied to it), use ast.literal_eval to evaluate the node into the value it represents:

import ast

{
    keyword.arg: ast.literal_eval(keyword.value)
    for node in ast.walk(ast.parse('_' + fileOutput)) if isinstance(node, ast.Call)
    for keyword in node.keywords
}

With your sample input, this returns:

{'id': 336346860, 'name': 'Western Australia', 'slug': 'western-australia', 'has_public_page': True, 'lat': -26.0, 'lng': 121.0}
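
Since the text comes from files, the same idea can be wrapped in a helper and applied line by line. The following is only a sketch: parse_record is a hypothetical name, io.StringIO stands in for a real file object, and the second sample line is invented for illustration.

```python
import ast
import io


def parse_record(text):
    """Turn "(key=value, ...)" into a dict by reading it as a call's keyword arguments."""
    # Prefixing "_" makes the text a syntactically valid call: _(key=value, ...)
    tree = ast.parse("_" + text.strip())
    return {
        keyword.arg: ast.literal_eval(keyword.value)
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        for keyword in node.keywords
    }


# Assumption: one record per line. io.StringIO plays the role of an open file;
# the second line is made-up sample data.
fake_file = io.StringIO(
    "(id=336346860, name='Western Australia', slug='western-australia', "
    "has_public_page=True, lat=-26.0, lng=121.0)\n"
    "(id=1, name='Example', slug='example', has_public_page=False, lat=1.5, lng=-2.5)\n"
)

# Skip blank lines and parse every remaining line into its own dict.
records = [parse_record(line) for line in fake_file if line.strip()]
print(records)
```

With a real file, fake_file would be replaced by the object returned from open().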

CodePudding user response:

You can do something with ast.parse. Prepend a function name (it doesn't have to be dict) so that the string parses as a call to that function, then extract the keyword arguments. For example, start with

>>> import ast
>>> mod = ast.parse('dict' + fileOutput)
>>> print(ast.dump(mod, indent=4))
Module(
    body=[
        Expr(
            value=Call(
                func=Name(id='dict', ctx=Load()),
                args=[],
                keywords=[
                    keyword(
                        arg='id',
                        value=Constant(value=336346860)),
                    keyword(
                        arg='name',
                        value=Constant(value='Western Australia')),
                    keyword(
                        arg='slug',
                        value=Constant(value='western-australia')),
                    keyword(
                        arg='has_public_page',
                        value=Constant(value=True)),
                    keyword(
                        arg='lat',
                        value=UnaryOp(
                            op=USub(),
                            operand=Constant(value=26.0))),
                    keyword(
                        arg='lng',
                        value=Constant(value=121.0))]))],
    type_ignores=[])

You can now extract the keywords pretty easily. The arguments can contain arbitrary expression trees (such as the UnaryOp for lat above), so you will have to apply ast.literal_eval to each keyword value independently. This is not particularly difficult.

First sanitize the input a little to make sure it at least appears to be a call to the dict constructor (or whatever function name you prepended):

if len(mod.body) > 1 or not isinstance(call := mod.body[0].value, ast.Call) or call.func.id != 'dict':
    raise ValueError('Not just one dict')
if call.args:
    raise ValueError('Why are there positional args?')

Now you can extract the keywords:

>>> {x.arg: ast.literal_eval(x.value) for x in call.keywords}
{'id': 336346860,
 'name': 'Western Australia',
 'slug': 'western-australia',
 'has_public_page': True,
 'lat': -26.0,
 'lng': 121.0}

ast.literal_eval will raise a ValueError if anyone tries to sneak in arbitrary function calls.
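
As a small illustration of that safety property, the sketch below feeds in a hypothetical hostile input (the string and the variable names are made up for the demo) and shows that the nested call is rejected rather than executed:

```python
import ast

# Hypothetical hostile input: one of the values is a function call, not a literal.
malicious = "(id=1, cmd=__import__('os').getcwd())"

# The text still parses, because it is syntactically valid Python.
call = ast.parse("dict" + malicious).body[0].value

rejected = False
try:
    {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
except ValueError as exc:
    # literal_eval only accepts literal nodes, so the Call node is refused
    # and nothing from the input is ever executed.
    rejected = True
    print("rejected:", exc)
```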

TL;DR

import ast

def parse_line(line):
    # Prepend 'dict' so the text parses as a call such as dict(id=..., name=...)
    mod = ast.parse('dict' + line)
    if len(mod.body) > 1 or not isinstance(call := mod.body[0].value, ast.Call) or call.func.id != 'dict':
        raise ValueError('Not just one dict')
    if call.args:
        raise ValueError('Why are there positional args?')
    return {x.arg: ast.literal_eval(x.value) for x in call.keywords}

CodePudding user response:

I built a custom class to fit your requirements, based on the few assumptions listed below:

  1. Input always starts and ends with parentheses ().
  2. Input may only be "" (empty string), "()" (empty parentheses), or actual values like "(id=336346860, name='Western Australia', slug='western-australia', has_public_page=True, lat=-26.0, lng=121.0)".
  3. Values are only of the Python types str, bool, int, or float.
  4. Key-value pairs are always separated by =.
  5. , (comma) is not part of any value (i.e. a comma is not present anywhere in the values).

If any one of the above assumptions is broken, the class may not work as expected.
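
To make assumption 5 concrete, here is a short sketch with a made-up input whose name value contains a comma. Splitting on ", " cuts the value in two, whereas parsing the text as a call with the standard ast module keeps it intact (the input string and variable names are illustrative only):

```python
import ast

# Made-up input that breaks assumption 5: the name contains ", ".
text = "(id=1, name='Perth, WA')"

# Splitting on ", " cuts the quoted value into two fragments.
pieces = text[1:-1].split(", ")
print(pieces)  # ['id=1', "name='Perth", "WA'"]

# Parsing the text as a call respects the quotes, so the value survives.
call = ast.parse("dict" + text).body[0].value
parsed = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
print(parsed)  # {'id': 1, 'name': 'Perth, WA'}
```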


The code for the class is given below:

from typing import Optional


class MyDict:
    def setRawElements(self):
        """Create a list by splitting the given string"""
        # Assumption #5
        # If there is any comma in the value, then the split may be inconsistent
        self.raw_elements = self.string.split(", ")

    def splitKeyValuePairs(self):
        """Split into key-value pairs and create an internal dictionary"""
        for elem in self.raw_elements:
            # Assumption #4
            # If the key and the value are not separated by '=', the split will fail
            # maxsplit=1 keeps any '=' inside the value from breaking the unpacking
            key, value = elem.split("=", 1)
            self.dictionary[key] = value

    def setKeyTypes(self):
        """Type conversion"""
        for key, value in self.dictionary.items():
            # Assumption #3
            # Value must be one among (bool, str, float, int)
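            if value in ["True", "False"]:
                # bool("False") is True (any non-empty string is truthy),
                # so compare against the literal instead of calling bool()
                self.dictionary[key] = value == "True"
                continue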
            elif value and value[0] == value[-1] == "'":
                # check if the value is a str object
                self.dictionary[key] = self.dictionary[key][1:-1]
                # we need not convert a str to str, so we can skip the conversion part
                continue
            elif "." in value:
                # float values have two parts, integer and fraction, separated by a period
                type_ = float
            else:
                # if none of the above cases match, we assume that the type is int
                type_ = int
            # type conversion from str to the expected type
            self.dictionary[key] = type_(self.dictionary[key])

    def parse(self, string):
        self.dictionary = {}
        self.string = string
        if string and string[1:-1]:
            # Assumption #1 and #2
            # If string is not empty and not just empty parenthesis
            self.string = self.string[1:-1]  # remove parenthesis from start and end
            self.setRawElements()
            self.splitKeyValuePairs()
            self.setKeyTypes()
        return self.dictionary

    def __new__(cls, string: str) -> Optional[dict]:
        """Calling a class will return parsed dictionary"""
        return super().__new__(cls).parse(string)

To use the class, refer to the code below:

fileOutput = "(id=336346860, name='Western Australia', slug='western-australia', has_public_page=True, lat=-26.0, lng=121.0)"
x = MyDict(fileOutput)
print(x)

Below is the output:

{'id': 336346860, 'name': 'Western Australia', 'slug': 'western-australia', 'has_public_page': True, 'lat': -26.0, 'lng': 121.0}

To check the types of the values, refer to the code below:

for key, value in x.items():
    print(key, value, type(value), sep=" - ")

Output:

id - 336346860 - <class 'int'>
name - Western Australia - <class 'str'>
slug - western-australia - <class 'str'>
has_public_page - True - <class 'bool'>
lat - -26.0 - <class 'float'>
lng - 121.0 - <class 'float'>