Convert a pandas data frame filled with values in curly brackets to numpy array

Time:08-19

I have a pandas DataFrame whose cells contain values written with curly brackets. I want a DataFrame with the same values, but with every curly-bracket value converted to a NumPy array. Here is one row of my DataFrame:

0, 5, '{{{1., 0.}, {0., 0.}}, {{0., 0.}, {0., 0.}}}',
   '{{{0., 0.}, {1., 0.}}, {{0.3333333333333333, 0.}, {0., 1.}}}',
   '{{{0., 0.}, {0., 0.}}, {{0., 0.}, {0., 0.}}}',
   '{0., 0.041666666666666664, 0., 0., 0.}', '{0., 0., 2., 1.}'

I want that row to look like this:

0, 5, array([[[1., 0.], [0., 0.]], [[0., 0.], [0., 0.]]]),
   array([[[0., 0.], [1., 0.]], [[0.3333333333333333, 0.], [0., 1.]]]),
   array([[[0., 0.], [0., 0.]], [[0., 0.], [0., 0.]]]),
   array([0., 0.041666666666666664, 0., 0., 0.]), array([0., 0., 2., 1.])

CodePudding user response:

Okay, I took the liberty of assuming those curly brackets in your original DataFrame are strings.

You can use a combination of a lambda expression and ast.literal_eval(x).

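import ast
import numpy as np
import pandas as pd

# Swap the braces for brackets so each cell is a valid Python list literal,
# parse it with ast.literal_eval, and wrap the result in a NumPy array.
# Note: pandas 2.1+ deprecates DataFrame.applymap in favour of DataFrame.map.
df = df.applymap(lambda x: np.array(ast.literal_eval(str(x).replace('{', '[').replace('}', ']')),
                                    dtype=object))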

The lambda converts each cell to a string, replaces '{' with '[' and '}' with ']', and then uses ast.literal_eval to parse that string into a (nested) list. Wrapping the result in np.array is only needed if you really want NumPy arrays; without it you get plain Python lists. Note that every cell goes through this conversion, so scalar cells such as 0 and 5 also come out as zero-dimensional arrays.
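As a rough sketch of the same idea, the conversion can be moved into a named helper that only touches brace strings and leaves scalar cells alone. The helper name braces_to_array, the column names and the choice of dtype=float are assumptions made for illustration, not part of the answer above.

```python
import ast

import numpy as np
import pandas as pd


def braces_to_array(value):
    """Turn a '{...}' string into a float ndarray; return anything else unchanged."""
    if isinstance(value, str) and value.lstrip().startswith("{"):
        as_list_literal = value.replace("{", "[").replace("}", "]")
        return np.array(ast.literal_eval(as_list_literal), dtype=float)
    return value


# One row modelled on the example in the question (column names are made up).
df = pd.DataFrame(
    {
        "a": [0],
        "b": [5],
        "c": ["{{{1., 0.}, {0., 0.}}, {{0., 0.}, {0., 0.}}}"],
        "d": ["{0., 0.041666666666666664, 0., 0., 0.}"],
    }
)

# Convert column by column; Series.map is available in old and new pandas alike.
converted = df.copy()
for name in converted.columns:
    converted[name] = converted[name].map(braces_to_array)

print(converted["c"].iloc[0].shape)
print(converted["d"].iloc[0])
```

Going column by column avoids depending on either DataFrame.applymap or DataFrame.map, so the sketch should behave the same across pandas versions.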

From another answer:

With ast.literal_eval you can safely evaluate an expression node or a string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, booleans, and None.
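A small illustration of that behaviour, and of why the braces have to be swapped first: in current Python versions literal_eval reads a flat brace expression as a set, which loses order and duplicates. The input strings below are made up for the demonstration.

```python
import ast

# A bracketed literal is parsed into the corresponding nested list.
parsed = ast.literal_eval("[[1., 0.], [0.3333333333333333, 0.]]")

# With the braces left in place, a flat value is read as a set,
# so the duplicate 0. and the original ordering are lost.
as_set = ast.literal_eval("{0., 0., 2., 1.}")

# Anything that is not a literal, such as a function call, is rejected
# with ValueError instead of being executed.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True

print(parsed, as_set, rejected)
```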

CodePudding user response:

You can simply:

  1. Replace every { with [ and every } with ], then use Python's built-in eval function to turn the string into a Python list.
  2. Create an np.array() from that list.

import numpy as np
import pandas as pd

data = pd.Series(['0', '5', '{{1.,0.},{0.,0.},{0.,0.}}', '2', '{{4.5, 5}, {0.3, 0.6}}', '200'])
data = data.apply(lambda x: np.array(eval(str(x).replace('{', '[').replace('}', ']'))) if '{' in str(x) else float(x))
print(data)
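Since eval executes whatever expression it is given, a variant of the snippet above that swaps in ast.literal_eval may be preferable when the strings come from an untrusted source. This is only a sketch; the helper name parse_cell is hypothetical.

```python
import ast

import numpy as np
import pandas as pd


def parse_cell(cell):
    """Brace strings become arrays; every other cell is converted to float."""
    text = str(cell)
    if "{" in text:
        return np.array(ast.literal_eval(text.replace("{", "[").replace("}", "]")))
    return float(text)


data = pd.Series(['0', '5', '{{1.,0.},{0.,0.},{0.,0.}}', '2', '{{4.5, 5}, {0.3, 0.6}}', '200'])
data = data.apply(parse_cell)

# Collect the shapes of the array-valued cells to check the conversion.
shapes = [cell.shape for cell in data if isinstance(cell, np.ndarray)]
print(shapes)
```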