I have a Pandas DataFrame whose values are written with curly brackets, and I want to convert it into a DataFrame with the same values stored as NumPy arrays instead of curly-bracket strings. This is an example of an instance of my DataFrame:
0, 5, '{{{1., 0.}, {0., 0.}}, {{0., 0.}, {0., 0.}}}',
'{{{0., 0.}, {1., 0.}}, {{0.3333333333333333, 0.}, {0., 1.}}}',
'{{{0., 0.}, {0., 0.}}, {{0., 0.}, {0., 0.}}}',
'{0., 0.041666666666666664, 0., 0., 0.}', '{0., 0., 2., 1.}'
I want this instance of the dataframe to be like this:
0, 5, array([[[1., 0.], [0., 0.]], [[0., 0.], [0., 0.]]]),
array([[[0., 0.], [1., 0.]], [[0.3333333333333333, 0.], [0., 1.]]]),
array([[[0., 0.], [0., 0.]], [[0., 0.], [0., 0.]]]),
array([0., 0.041666666666666664, 0., 0., 0.]), array([0., 0., 2., 1.])
CodePudding user response:
Okay, I took the liberty of assuming those curly brackets in your original DataFrame are strings.
You can use a combination of a lambda expression and ast.literal_eval(x).
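import ast
import numpy as np
import pandas as pd

# Swap the braces for brackets, parse the result as a Python literal, then wrap it in an array.
# On pandas 2.1 and later, DataFrame.map replaces the deprecated applymap.
df = df.applymap(
    lambda x: np.array(ast.literal_eval(str(x).replace('{', '[').replace('}', ']')), dtype=object)
)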
This expression applies a function that first converts each value to a string. It then replaces '{' with '[' and '}' with ']', and after that it uses ast.literal_eval to convert the string to a list. np.array is there if you really want a NumPy array, but it isn't necessary.
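If you would rather leave the plain numbers (the 0 and 5 in your example) untouched, the same steps can be put into a small helper. This is only a sketch: braces_to_array and the column names are made up for illustration, and dtype=float assumes every nested brace group has the same length.

```python
import ast

import numpy as np
import pandas as pd


def braces_to_array(value):
    """Turn a '{...}' string into a NumPy array; return anything else unchanged."""
    # Hypothetical helper, not part of pandas or NumPy.
    if isinstance(value, str) and value.strip().startswith("{"):
        as_list_text = value.replace("{", "[").replace("}", "]")
        # Assumes rectangular nesting, so a float array can be built.
        return np.array(ast.literal_eval(as_list_text), dtype=float)
    return value


# One-row frame modelled on the question; the column names are invented.
df = pd.DataFrame(
    {
        "a": [0],
        "b": [5],
        "c": ["{{{1., 0.}, {0., 0.}}, {{0., 0.}, {0., 0.}}}"],
        "d": ["{0., 0., 2., 1.}"],
    }
)

# Series.map applies the helper element by element, one column at a time.
converted = df.apply(lambda column: column.map(braces_to_array))
```

Afterwards converted["c"].iloc[0] should be a regular NumPy array that you can index and reshape, while the integer columns stay as they were.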
From another answer:
With ast.literal_eval you can safely evaluate an expression node or a string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, booleans, and None.
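To make that concrete, here is a small sketch of what "safely" means in practice. The input strings are invented for the demonstration: a nested list literal is parsed into ordinary Python objects, while a string containing a function call is rejected rather than executed.

```python
import ast

# A literal container parses into ordinary Python objects.
parsed = ast.literal_eval("[[0.0, 1.0], [2.0, 3.0]]")

# A function call is not a literal, so literal_eval refuses it instead of running it.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
```

This is the reason to prefer ast.literal_eval over eval when the strings come from a file or another source you do not fully control.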
CodePudding user response:
You can simply:
- replace all { with [ and all } with ], then use Python's eval function to convert the result into a Python list
- create an np.array() from the Python list
import numpy as np
import pandas as pd
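# Sample data mixing plain numbers with curly-bracket strings
data = pd.Series(['0', '5', '{{1.,0.},{0.,0.},{0.,0.}}', '2', '{{4.5, 5}, {0.3, 0.6}}', '200'])
# Entries containing '{' become NumPy arrays; everything else becomes a float
data = data.apply(lambda x: np.array(eval(str(x).replace('{', '[').replace('}', ']'))) if '{' in str(x) else float(x))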
print(data)