I am trying to put other fields from the json after normalize the nested part. I had followed some examples like I need but I don't understand why the error : KeyError: 'phone Number'
In others words I need put another fields from the json in order to complet the neested fields and get a better structure in my panda dataframe
This is my code :
import json
import pandas as pd
data ={
"consumer": {
"phoneNumber": "3156578877",
"channelId": "83",
"appId": "APP_DAVIPLATA",
"moduleId": "MA_PSE_VNZ",
"sessionId": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
"appVersion": "16.1.1",
"soVersion": "Android 11",
"agentInfo": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.77 Safari/537.36",
"ipDevice": "163.111.221.230"
},
"transactionHeader": {
"transactionType": "LOG",
"transactionId": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
"transactionDate": "2018-07-03T17:54:36.762-05:00"
},
"transactionDetail": {
"logType": "ANALITICA_OPERACIONAL",
"MediaTarjetId": "PSE_VNZ_1",
"Consumer": {
"phoneNumber": "3156578877",
"channelId": "83",
"appId": "APP_DAVIPLATA",
"moduleId": "MA_PSE_VNZ",
"sessionId": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
"appVersion": "16.1.1",
"soVersion": "Android 11",
"agentInfo": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.77 Safari/537.36",
"ipDevice": "163.111.221.230"
},
"Transaction": {
"transactionType": "LOG",
"transactionId": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
"transactionDate": "2018-07-03T17:54:36.762-05:00"
},
"Client": {
"identificationType": "CC",
"identificationNumber": "1027868487",
"documentExpeditionDate": "2009-05-10",
"documentExpeditionPlace": "Bogota"
},
"Product": {
"productCode": "DVP_CO",
"productNumber": "3158765639"
},
"Messages": {
"OperationalAnalytics": [
{
"nameField": "fecha_transaccion",
"valueField": "2022071518:35:50",
"valueFormat": "YYYYMMDDHH:MM:SS"
},
{
"nameField": "nombre_transaccion",
"valueField": "DEBITO PAGO",
"valueFormat": "String"
},
{
"nameField": "valor",
"valueField": "5300",
"valueFormat": "Number"
},
{
"nameField": "referencia_destino",
"valueField": "3156547865",
"valueFormat": "String"
}
]
}
}
}
pd.json_normalize(data, record_path = ['transactionDetail','Messages','OperationalAnalytics'],meta = [['consumer','phoneNumber'], 'transactionHeader'])
KeyError: 'phoneNumber'
In this code some like I need it worked
import pandas as pd
data = [
{
"company": "Google",
"tagline": "Dont be evil",
"management": {"CEO": "Sundar Pichai"},
"department": [
{"name": "Gmail", "revenue (bn)": 123},
{"name": "GCP", "revenue (bn)": 400},
{"name": "Google drive", "revenue (bn)": 600},
],
},
{
"company": "Microsoft",
"tagline": "Be What's Next",
"management": {"CEO": "Satya Nadella"},
"department": [
{"name": "Onedrive", "revenue (bn)": 13},
{"name": "Azure", "revenue (bn)": 300},
{"name": "Microsoft 365", "revenue (bn)": 300},
],
},
]
result = pd.json_normalize(
data, "department", ["company", "tagline", ["management", "CEO"]]
)
result
Thanks for your help
CodePudding user response:
The json in your working example is a list of dictionaries while the one you are trying to transform is a dictionary.
You can access the dictionary elements directly with data[key]
and convert them to dataframes that you will concatenate... One tricky part of your example are the keys in 'OperationalAnalytics'. I guess you would want the nameField
values to be column names so that you get one line per transaction. This can be achieved by setting the nameField
column to index and transposing:
pd.concat([pd.DataFrame([data["consumer"]])["phoneNumber"],
pd.DataFrame([data["transactionHeader"]]),
pd.DataFrame(data['transactionDetail']['Messages']['OperationalAnalytics']
).set_index('nameField')[['valueField']].T.reset_index(drop=True)
], axis=1)
Output:
phoneNumber transactionType transactionId transactionDate fecha_transaccion nombre_transaccion valor referencia_destino
0 3156578877 LOG 3fa85f64-5717-4562-b3fc-2c963f66afa6 2018-07-03T17:54:36.762-05:00 2022071518:35:50 DEBITO PAGO 5300 3156547865
CodePudding user response:
Imagine some thing like this :
Output:
phoneNumber channelId appId moduleId sessionId appVersion soVersion agentInfo ipDevice nameField valueField 0 3156578877 83 APP_DAVIPLATA MA_PSE_VNZ 3fa85f64-5717-4562-b3fc-2c963f66afa6 16.1.1 Android 11 Mozilla/5.0 (Windows NT 10.0; Win64; x64) Appl... 163.111.221.230 referencia_destino 3156547865 1 3156578877 83 APP_DAVIPLATA MA_PSE_VNZ 3fa85f64-5717-4562-b3fc-2c963f66afa6 16.1.1 Android 11 Mozilla/5.0 (Windows NT 10.0; Win64; x64) Appl... 163.111.221.230 valor 5300 2 3156578877 83 APP_DAVIPLATA MA_PSE_VNZ 3fa85f64-5717-4562-b3fc-2c963f66afa6 16.1.1 Android 11 Mozilla/5.0 (Windows NT 10.0; Win64; x64) Appl... 163.111.221.230 nombre_transaccion DEBITO PAGO 3 3156578877 83 APP_DAVIPLATA MA_PSE_VNZ 3fa85f64-5717-4562-b3fc-2c963f66afa6 16.1.1 Android 11 Mozilla/5.0 (Windows NT 10.0; Win64; x64) Appl... 163.111.221.230 fecha_transaccion 2022071518:35:50