I have the following "json style" file (excerpt). It comes from reading an API.
[
{
"allSites": {
"activeLicenses": 75660,
"totalLicenses": 0
},
"sites": [
{
"accountId": "56378637863",
"accountName": "XX | ACME",
"activeLicenses": 0,
"createdAt": "2021-01-12T20:04:12.166693Z",
"creator": null,
"creatorId": null,
"description": null,
"expiration": null,
"externalId": null,
"healthStatus": true,
"id": "8637863",
"isDefault": true,
"licenses": {
"bundles": [
{
"displayName": "Control",
"majorVersion": 1,
"minorVersion": 1,
"name": "control",
"surfaces": [
{
"count": 50000,
"name": "Total Agents"
}
],
"totalSurfaces": 50000
},
{
"displayName": "Complete",
"majorVersion": 1,
"minorVersion": 1,
"name": "complete",
"surfaces": [
{
"count": 50000,
"name": "Total Agents"
}
],
"totalSurfaces": 50000
}
],
"modules": [
{
"displayName": "STAR",
"majorVersion": 1,
"name": "star"
},
{
"displayName": "Binary Vault - Malicious Files",
"majorVersion": 1,
"name": "binary_vault_malicious"
}
],
"settings": [
{
"displayName": "14 Days",
"groupName": "dv_retention",
"setting": "14 Days",
"settingGroup": "dv_retention",
"settingGroupDisplayName": "Deep Visibility Data Retention"
},
{
"displayName": "365 Days",
"groupName": "malicious_data_retention",
"setting": "365 Days",
"settingGroup": "malicious_data_retention",
"settingGroupDisplayName": "Malicious Data Retention"
},
{
"displayName": "Enabled",
"groupName": "remote_shell_availability",
"setting": "Enabled",
"settingGroup": "remote_shell_availability",
"settingGroupDisplayName": "Remote Shell"
},
{
"displayName": "Available",
"groupName": "marketplace_access_status",
"setting": "Available",
"settingGroup": "marketplace_access_status",
"settingGroupDisplayName": "Marketplace Access"
}
]
},
"name": "Default site",
"registrationToken": "7387893783783783==",
"siteType": "Paid",
"sku": "Complete",
"state": "active",
"suite": "Complete",
"totalLicenses": 100000,
"unlimitedExpiration": true,
"unlimitedLicenses": false,
"updatedAt": "2022-05-22T09:26:14.941439Z"
},
{
"accountId": "753733783873",
"accountName": "Core Account",
"activeLicenses": 0,
"createdAt": "2021-01-15T12:21:41.106290Z",
"creator": "JDOE",
"creatorId": "63838383",
"description": null,
"expiration": "2021-01-22T02:00:00Z",
"externalId": "348637889383",
"healthStatus": true,
"id": "3783838737893",
"isDefault": true,
"licenses": {
"bundles": [
{
"displayName": "Core",
"majorVersion": 1,
"minorVersion": 1,
"name": "core",
"surfaces": [
{
"count": 1,
"name": "Total Agents"
}
],
"totalSurfaces": 1
}
],
"modules": [],
"settings": [
{
"displayName": "365 Days",
"groupName": "malicious_data_retention",
"setting": "365 Days",
"settingGroup": "malicious_data_retention",
"settingGroupDisplayName": "Malicious Data Retention"
},
{
"displayName": "Available",
"groupName": "marketplace_access_status",
"setting": "Available",
"settingGroup": "marketplace_access_status",
"settingGroupDisplayName": "Marketplace Access"
}
]
},
"name": "Default site",
"registrationToken": "3783838383373==",
"siteType": "Trial",
"sku": "Core",
"state": "expired",
"suite": "Core",
"totalLicenses": 1,
"unlimitedExpiration": false,
"unlimitedLicenses": false,
"updatedAt": "2022-05-22T09:26:14.941439Z"
} ]
I'd like to write certain information out of that into an excel sheet with:
df = pd.DataFrame(data)
df.head()
df[['accountId', 'accountName', 'name', 'siteType', 'externalId',
'activeLicenses', 'totalLicenses', 'sku', 'state']].to_excel(completename)
Error message received:
KeyError: "None of [Index(['accountId', 'accountName', 'name', 'siteType', 'externalId',\n 'activeLicenses', 'totalLicenses', 'sku', 'state'],\n dtype='object')] are in the [columns]"
how can I access the elements stated above in that output? thanks
CodePudding user response:
In spyder, your dict appears in the wrong format. If you do not get an error in the dictionary, you can proceed to the second step. There are missing parentheses. this is the correct format (at least the one that works without error for me):
a=[
{
"allSites": {
"activeLicenses": 75660,
"totalLicenses": 0
},
"sites": [
{
"accountId": "56378637863",
"accountName": "XX | ACME",
"activeLicenses": 0,
"createdAt": "2021-01-12T20:04:12.166693Z",
"creator": null,
"creatorId": null,
"description": null,
"expiration": null,
"externalId": null,
"healthStatus": true,
"id": "8637863",
"isDefault": true,
"licenses": {
"bundles": [
{
"displayName": "Control",
"majorVersion": 1,
"minorVersion": 1,
"name": "control",
"surfaces": [
{
"count": 50000,
"name": "Total Agents"
}
],
"totalSurfaces": 50000
},
{
"displayName": "Complete",
"majorVersion": 1,
"minorVersion": 1,
"name": "complete",
"surfaces": [
{
"count": 50000,
"name": "Total Agents"
}
],
"totalSurfaces": 50000
}
],
"modules": [
{
"displayName": "STAR",
"majorVersion": 1,
"name": "star"
},
{
"displayName": "Binary Vault - Malicious Files",
"majorVersion": 1,
"name": "binary_vault_malicious"
}
],
"settings": [
{
"displayName": "14 Days",
"groupName": "dv_retention",
"setting": "14 Days",
"settingGroup": "dv_retention",
"settingGroupDisplayName": "Deep Visibility Data Retention"
},
{
"displayName": "365 Days",
"groupName": "malicious_data_retention",
"setting": "365 Days",
"settingGroup": "malicious_data_retention",
"settingGroupDisplayName": "Malicious Data Retention"
},
{
"displayName": "Enabled",
"groupName": "remote_shell_availability",
"setting": "Enabled",
"settingGroup": "remote_shell_availability",
"settingGroupDisplayName": "Remote Shell"
},
{
"displayName": "Available",
"groupName": "marketplace_access_status",
"setting": "Available",
"settingGroup": "marketplace_access_status",
"settingGroupDisplayName": "Marketplace Access"
}
]
},
"name": "Default site",
"registrationToken": "7387893783783783==",
"siteType": "Paid",
"sku": "Complete",
"state": "active",
"suite": "Complete",
"totalLicenses": 100000,
"unlimitedExpiration": true,
"unlimitedLicenses": false,
"updatedAt": "2022-05-22T09:26:14.941439Z"
},
{
"accountId": "753733783873",
"accountName": "Core Account",
"activeLicenses": 0,
"createdAt": "2021-01-15T12:21:41.106290Z",
"creator": "JDOE",
"creatorId": "63838383",
"description": null,
"expiration": "2021-01-22T02:00:00Z",
"externalId": "348637889383",
"healthStatus": true,
"id": "3783838737893",
"isDefault": true,
"licenses": {
"bundles": [
{
"displayName": "Core",
"majorVersion": 1,
"minorVersion": 1,
"name": "core",
"surfaces": [
{
"count": 1,
"name": "Total Agents"
}
],
"totalSurfaces": 1
}
],
"modules": [],
"settings": [
{
"displayName": "365 Days",
"groupName": "malicious_data_retention",
"setting": "365 Days",
"settingGroup": "malicious_data_retention",
"settingGroupDisplayName": "Malicious Data Retention"
},
{
"displayName": "Available",
"groupName": "marketplace_access_status",
"setting": "Available",
"settingGroup": "marketplace_access_status",
"settingGroupDisplayName": "Marketplace Access"
}
]
},
"name": "Default site",
"registrationToken": "3783838383373==",
"siteType": "Trial",
"sku": "Core",
"state": "expired",
"suite": "Core",
"totalLicenses": 1,
"unlimitedExpiration": false,
"unlimitedLicenses": false,
"updatedAt": "2022-05-22T09:26:14.941439Z"}
]
}]
after providing this format, you should run the following code:
df = pd.DataFrame(a)
mask = df.explode('sites', ignore_index=True)
df=mask.join(pd.DataFrame([*mask.pop('sites')], index=mask.index))
or(better):
s = df.explode(['sites']).reset_index()
final = pd.json_normalize(s["sites"])
now you can select specific columns. it was like this at first