I have the following array structure, which I read from a .csv file:
0,Done,"[{'id': '7-84-1811', 'idType': 'CIP', 'suscriptionId': '89877485'}]"
0,Done,"[{'id': '1-232-42', 'idType': 'IO', 'suscriptionId': '23532r32'}]"
0,Done,"[{'id': '2323p23', 'idType': 'LP', 'suscriptionId': 'e32e23dw'}]"
0,Done,"[{'id': 'AU23242', 'idType': 'LL', 'suscriptionId': 'dede143234'}]"
To be able to handle it with pandas, I created its respective columns, but I only need to access the "id" and "idType" properties.
My code:
from pandas.io.json import json_normalize
import pandas as pd
path = 'path_file'
df_fet = pd.read_csv(path, names=['error', 'resul', 'fields'])
df_work = df_fet[['fields'][0]['id', 'idType']]
print(df_work.head())
It returns the error: TypeError: string indices must be integers
Desired output:
id, idType
0. '7-84-1811', 'CIP'
1. '1-232-42', 'IO'
...
CodePudding user response:
Here's a way to achieve the desired output:
import pandas as pd
path = 'filepath'
df = pd.read_csv(path, names=['error', 'resul', 'fields'])
df["fields"] = df["fields"].fillna("[]").apply(lambda x: eval(x))
arr = []
for row in df["fields"]:
    arr.append([row[0]["id"], row[0]["idType"]])
new = pd.DataFrame(arr, columns=["id", "idType"])
print(new)
Output:
eval() makes Python interpret its argument as a Python expression, so each string in the fields column is evaluated into an actual list.
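One caveat: eval() will run any code found in the file, so it is only reasonable for input you trust. A possible variant, sketched here under the assumption that every cell holds a plain Python literal like the sample rows, is to parse with ast.literal_eval instead. The sample_csv string below is a stand-in for the real file (two rows copied from the question), and rows with an empty list are skipped rather than indexed.

```python
import ast
import io

import pandas as pd

# Stand-in for the real file: two of the sample rows from the question.
sample_csv = '''0,Done,"[{'id': '7-84-1811', 'idType': 'CIP', 'suscriptionId': '89877485'}]"
0,Done,"[{'id': '1-232-42', 'idType': 'IO', 'suscriptionId': '23532r32'}]"
'''

df = pd.read_csv(io.StringIO(sample_csv), names=["error", "resul", "fields"])

# literal_eval only accepts Python literals (lists, dicts, strings, numbers),
# so it cannot execute arbitrary code the way eval() can.
parsed = df["fields"].fillna("[]").apply(ast.literal_eval)

# Keep the first dict of each list; empty lists (from missing cells) are skipped.
records = [items[0] for items in parsed if items]

# Passing columns= keeps only the keys we care about.
new = pd.DataFrame(records, columns=["id", "idType"])
print(new)
```

To run it against the real file, pass the file path to pd.read_csv in place of the io.StringIO object.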