I have a CSV file with a column, df['questions'], that contains JSON-like data:
| Date | Agent Name | Questions |
| --- | --- | --- |
| 8/5/2022 | Alaa M | (JSON-like text, see the example below) |
| 8/5/2022 | Othman M | (JSON-like text, see the example below) |
An example of the data in that column:
[ {'id': 'dee52266-c096-47f4-96d4-6346498039ee', 'name': '1.G – Did an issue been raised?', 'displayOrder': 13, 'type': 'choice', 'multiSelect': False, 'questionsResponseModel': [{'id': 'a3e0ac59-5cc1-4654-a6bc-fbc71d86ba25', 'name': 'No'}], 'parentGroup': 'f1654f7c-204f-48d0-b940-ee9bb98eafa0', 'score': '0', 'maxScore': '0', 'percentage': '0'}, {'id': '6b0a92b4-fad9-488d-8296-030799ee00eb', 'name': '1.G - Comment', 'displayOrder': 14, 'type': 'text', 'multiSelect': None, 'questionsResponseModel': 'NA', 'parentGroup': 'f1654f7c-204f-48d0-b940-ee9bb98eafa0', 'score': '0', 'maxScore': '0', 'percentage': '0'} ]
import json
import numpy as np
import pandas as pd
# raw string so the backslash in the Windows path is not read as an escape
df = pd.read_csv(r'Desktop\Data.csv')
# First, I tried replacing ' with " so the text can be read as JSON, but it does not work
def js(row):
    return row['questions'].lower().replace("'", '"')

df['new_questions'] = df.apply(js, axis=1)
df["new_questions_2"] = df["new_questions"].apply(json.loads)
# Second, I tried applying pd.Series, which also does not work
out = (df.drop(columns=['questions'])
         .join(df['questions'].apply(pd.Series).add_prefix('questions_'))
      )
Answer:
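The column holds Python literals rather than JSON (single quotes, False, None), so json.loads cannot parse it. Try ast.literal_eval instead, then explode the lists and expand the dicts into columns: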
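import ast
import pandas as pd

# parse each cell from a Python-literal string into a list of dicts
df["Questions"] = df["Questions"].apply(ast.literal_eval)
# one row per question
df = df.explode("Questions")
# expand each question dict into its own columns
df = pd.concat([df, df.pop("Questions").apply(pd.Series)], axis=1)
# one row per response; a plain 'NA' string stays as a single value
df = df.explode("questionsResponseModel")
df = pd.concat(
    [df, df.pop("questionsResponseModel").apply(pd.Series).add_prefix("qrm_")],
    axis=1,
)
# "qrm_0" only holds the scalar 'NA' values, so drop it
df = df.drop(columns="qrm_0")
print(df)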
Prints:
       Date  Agent Name                                    id                             name  displayOrder    type  multiSelect                           parentGroup  score  maxScore  percentage                                qrm_id  qrm_name
0  8/5/2022      Alaa M  dee52266-c096-47f4-96d4-6346498039ee  1.G – Did an issue been raised?            13  choice        False  f1654f7c-204f-48d0-b940-ee9bb98eafa0      0         0           0  a3e0ac59-5cc1-4654-a6bc-fbc71d86ba25        No
0  8/5/2022      Alaa M  6b0a92b4-fad9-488d-8296-030799ee00eb                    1.G - Comment            14    text         None  f1654f7c-204f-48d0-b940-ee9bb98eafa0      0         0           0                                   NaN       NaN
1  8/5/2022    Othman M  dee52266-c096-47f4-96d4-6346498039ee  1.G – Did an issue been raised?            13  choice        False  f1654f7c-204f-48d0-b940-ee9bb98eafa0      0         0           0  a3e0ac59-5cc1-4654-a6bc-fbc71d86ba25        No
1  8/5/2022    Othman M  6b0a92b4-fad9-488d-8296-030799ee00eb                    1.G - Comment            14    text         None  f1654f7c-204f-48d0-b940-ee9bb98eafa0      0         0           0                                   NaN       NaN
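As for why the json.loads attempt in the question fails while ast.literal_eval works, here is a minimal sketch. The cell value is a shortened, hypothetical stand-in for one entry of the Questions column, not the real data:

```python
import ast
import json

# Hypothetical, shortened cell in the same style as the question's column.
cell = "[{'name': '1.G - Comment', 'multiSelect': None, 'questionsResponseModel': 'NA'}]"

# The approach from the question: lowercase, then swap the quote characters.
as_json = cell.lower().replace("'", '"')
try:
    json.loads(as_json)
    json_ok = True
except json.JSONDecodeError:
    # None became none, which is not a JSON token (JSON spells it null).
    json_ok = False

# literal_eval reads Python literals directly, so no rewriting is needed.
parsed = ast.literal_eval(cell)

print(json_ok)
print(parsed[0]["multiSelect"])
```

Lowercasing also alters every name and answer in the data, so even a patched-up replace chain would change the content. literal_eval leaves the values untouched.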
Edit: "exploded" the questionsResponseModel column too.
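A possible variation, in case apply(pd.Series) turns out slow on a large file: pd.json_normalize can expand the question dicts in one call. This is only a sketch under assumptions. The small frame below is a hypothetical stand-in for Data.csv, the helper answer_names is made up for illustration, and joining the response names with ", " is one arbitrary way to summarise them:

```python
import ast

import pandas as pd

# Hypothetical, shortened stand-in for one cell of the Questions column.
cell = (
    "[{'id': 'q1', 'name': 'Issue raised?', 'type': 'choice', "
    "'questionsResponseModel': [{'id': 'r1', 'name': 'No'}]}, "
    "{'id': 'q2', 'name': 'Comment', 'type': 'text', "
    "'questionsResponseModel': 'NA'}]"
)
df = pd.DataFrame(
    {
        "Date": ["8/5/2022", "8/5/2022"],
        "Agent Name": ["Alaa M", "Othman M"],
        "Questions": [cell, cell],
    }
)

# Parse the text, then produce one row per question with a fresh unique index.
df["Questions"] = df["Questions"].apply(ast.literal_eval)
df = df.explode("Questions").reset_index(drop=True)

# Expand the question dicts into columns; list values are left as they are.
questions = pd.json_normalize(df.pop("Questions").tolist())
out = pd.concat([df, questions], axis=1)


def answer_names(model):
    """Join the response names; return None when there is no response list."""
    if isinstance(model, list):
        return ", ".join(item["name"] for item in model)
    return None  # e.g. the plain 'NA' string


out["qrm_name"] = out["questionsResponseModel"].apply(answer_names)
print(out.drop(columns="questionsResponseModel"))
```

Resetting the index after explode keeps the row labels unique, which makes the later concat easier to reason about.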