How to convert key and value of dictionary from byte to string?

Time:05-06

I have a dictionary whose keys and values are bytes. It can contain nested dictionaries, and I do not know in advance how deeply they are nested.

A sample of the data looks like this:

 1:{
      b'key1':{
         b'key11':2022,
         b'key12':1,
         b'key13':2022,
         b'key32':1,
         b'key14':b'x86\xe3\x88',
         b'key21':b'U_001776',
         b'key34':b'\xe6\xb4\xbe\xe9\x81\xa3\xe7\xa4\xbe\xe5\x93\xa1',
         b'key65':b'U_001506',
         b'key45':b'\xbc',
         b'key98':b'1\x81\x88',
         b'kwy66':{
            b'keyq':b'sometext'
         }
      }
   },

To convert it to strings, I tried this:


def convert_dict(data):
    if isinstance(data,str):
        return data
    elif isinstance(data,bytes):
        return data.decode()
    elif isinstance(data,dict):
        for key,val in data.items():
            if isinstance(key,bytes):
                data[key.decode()] = convert_dict(data[key])
            else:
                data[key] = convert_dict(data[key])
        return data
    elif isinstance(data,list):
        temp_list = []
        for dt in data:
            temp_list.append(convert_dict(dt))
        return temp_list
    else:
        return data

I am getting the error "dictionary changed size during iteration". Is there a mistake in this code? Please help.

Edit1.

The data is actually serialized in PHP, and I have to unserialize it in Python. I used the phpserialize package to convert it to a dictionary:

from phpserialize import loads
temp = loads(serialized_data.encode())

I get a dictionary back, but its keys and values are bytes. I had to call serialized_data.encode() because loads only accepts bytes. I pass this temp to the convert_dict function.

CodePudding user response:

You can't add or remove keys in a dict while iterating over it. Modifying almost any collection during iteration is unsafe; dicts check for it and raise an exception rather than silently misbehaving. Your loop inserts each decoded key into the same dict it is iterating over, which changes its size (and would leave the original bytes keys in place anyway). So build a new dict and return that instead:

def convert_dict(data):
    if isinstance(data,str):
        return data
    elif isinstance(data,bytes):
        return data.decode()
    elif isinstance(data,dict):
        newdata = {}  # Build a new dict
        for key, val in data.items():
            # Simplify code path by just doing decoding in conditional, insertion unconditional
            if isinstance(key,bytes):
                key = key.decode()
            newdata[key] = convert_dict(val)  # Update new dict (and use the val since items() gives it for free)
        return newdata
    elif isinstance(data,list):
        return [convert_dict(dt) for dt in data]
    else:
        return data

Just for fun, I also made some minor modifications to reduce code repetition, so most of the work goes through common paths, and simplified the list case with a list comprehension.
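For reference, here is a minimal sketch that reproduces the original error in isolation. The helper name rename_in_place and the two-key input are made up for illustration; the point is only that inserting a decoded key during iteration grows the dict, and Python notices on the next loop step.

```python
def rename_in_place(data):
    # Hypothetical helper: mimics the question's loop by adding a decoded
    # copy of each bytes key to the dict that is being iterated.
    for key in data:
        if isinstance(key, bytes):
            data[key.decode()] = data[key]  # grows the dict mid-iteration
    return data


try:
    rename_in_place({b"key1": 1, b"key2": 2})
    error_message = None
except RuntimeError as exc:
    # Raised when the loop asks for the next key after the dict has grown.
    error_message = str(exc)

print(error_message)  # dictionary changed size during iteration
```

Building a fresh dict, as in the function above, avoids the problem entirely because the dict being iterated is never touched.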

CodePudding user response:

You can't change a dictionary that you are iterating over. It is better to return a new structure:

def convert(d):
    if isinstance(d, dict):
        return {convert(k): convert(v) for k, v in d.items()}
    if isinstance(d, list):
        return [convert(i) for i in d]
    if isinstance(d, bytes):
        return d.decode()
    return d
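One caveat, sketched below under assumptions: bytes.decode() defaults to strict UTF-8, and some values in the question's sample (for example b'\xbc') are not valid UTF-8, so a strict decode would raise UnicodeDecodeError on them. The variant here, with the made-up name convert_lenient, passes an encoding and an error policy through to decode(). Whether "replace" is the right policy, or whether the data is really in another encoding, depends on how the PHP side produced it, which the question does not say.

```python
def convert_lenient(d, encoding="utf-8", errors="replace"):
    # Same recursive shape as convert(), but decoding never raises:
    # undecodable bytes become U+FFFD when errors="replace".
    if isinstance(d, dict):
        return {
            convert_lenient(k, encoding, errors): convert_lenient(v, encoding, errors)
            for k, v in d.items()
        }
    if isinstance(d, list):
        return [convert_lenient(i, encoding, errors) for i in d]
    if isinstance(d, bytes):
        return d.decode(encoding, errors)
    return d


# A trimmed-down version of the sample data from the question.
sample = {
    1: {
        b"key1": {
            b"key11": 2022,
            b"key34": b"\xe6\xb4\xbe\xe9\x81\xa3\xe7\xa4\xbe\xe5\x93\xa1",
            b"key45": b"\xbc",  # not valid UTF-8 on its own
            b"kwy66": {b"keyq": b"sometext"},
        }
    }
}

result = convert_lenient(sample)
print(result)
```

Non-bytes keys and values (the integer key 1, the value 2022) pass through unchanged, and the original dict is left untouched.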