I'm doing api requests to get json file to be parsed and converted into data frames. Json file sometimes may have empty fields, I am posting 2 possible cases where 1st json fill have the field I am looking for and the 2nd json file has that field empty.
1st json file:
print(resp2)
{
"entityId": "proc_1234",
"displayName": "oracle12",
"firstSeenTms": 1639034760000,
"lastSeenTms": 1650386100000,
"properties": {
"detectedName": "oracle.sysman.gcagent.tmmain.TMMain",
"bitness": "64",
"jvmVendor": "IBM",
"metadata": [
{
"key": "COMMAND_LINE_ARGS",
"value": "/usr/local/oracle/oem/agent12c/agent_13.3.0.0.0"
},
{
"key": "EXE_NAME",
"value": "java"
},
{
"key": "EXE_PATH",
"value": "/usr/local/oracle/oem/agent*c/agent_*/oracle_common/jdk/bin/java"
},
{
"key": "JAVA_MAIN_CLASS",
"value": "oracle.sysman.gcagent.tmmain.TMMain"
},
{
"key": "EXE_PATH",
"value": "/usr/local/oracle/oem/agent12c/agent_13.3.0.0.0/oracle_common/jdk/bin/java"
}
]
}
}
2nd Json file:
print(resp2)
{
"entityId": "PROCESS_GROUP_INSTANCE-FB8C65551916D57D",
"displayName": "Windows System",
"firstSeenTms": 1619147697131,
"lastSeenTms": 1653404640000,
"properties": {
"detectedName": "Windows System",
"bitness": "32",
"metadata": [],
"awsNameTag": "Windows System",
"softwareTechnologies": [
{
"type": "WINDOWS_SYSTEM"
}
],
"processType": "WINDOWS_SYSTEM"
}
}
as you can see metadata": [] empty.
I need to extract entityId, detectedName and if metada has data, I need to get EXE_NAME and EXE_PATH. if metada section is empty, I still need to get the entityId and detectedName from this json file and form a data frame.
so, I have done this:
#retrieve the detecteName value from the json
det_name = list(resp2.get('properties','detectedName').values())[0]
#retrieve EXE_NAME, EXE_PATH and entityId from the json. This part works when metata section has data
Procdf=(pd.json_normalize(resp2, record_path=['properties', 'metadata'], meta=['entityId']).drop_duplicates(subset=['key']).query("key in ['EXE_NAME','EXE_PATH']").assign(detectedName=det_name).pivot('entityId', 'key', 'value').reset_index())
#Add detectedName to the Procdf data frame
Procdf["detectedName"] = det_name
this above code snippet works when metadata has data, if it has no data [], I still need to create a data frame with entityId, detectedName and EXE_NAME and EXE_PATH being empty.
how can I do this? Right now when metadat[], I get this error name 'key' is not defined and skipps that json.
CodePudding user response:
Why not create a new dict based on whether there's value for metadata
or not?
Here's an example (this should work with both response types):
import pandas as pd
def find_value(response: dict, key: str) -> str:
result = [
x['value'] for x in response['properties']['metadata']
if x['key'] == key
]
return result[0] if result else ""
def get_values(response: dict) -> dict:
return {
"entityId": response['entityId'],
"displayName": response['displayName'],
"EXE_NAME": find_value(response, 'EXE_NAME'),
"EXE_PATH": find_value(response, 'EXE_PATH'),
}
sample_response = {
"entityId": "PROCESS_GROUP_INSTANCE-FB8C65551916D57D",
"displayName": "Windows System",
"firstSeenTms": 1619147697131,
"lastSeenTms": 1653404640000,
"properties": {
"detectedName": "Windows System",
"bitness": "32",
"metadata": [],
"awsNameTag": "Windows System",
"softwareTechnologies": [
{
"type": "WINDOWS_SYSTEM"
}
],
"processType": "WINDOWS_SYSTEM"
}
}
print(pd.json_normalize(get_values(sample_response)))
Sample output for metadata
being empty:
entityId displayName EXE_NAME EXE_PATH
0 PROCESS_GROUP_INSTANCE-FB8C65551916D57D Windows System
And one when metadata
carries, well, data:
entityId ... EXE_PATH
0 proc_1234 ... /usr/local/oracle/oem/agent*c/agent_*/oracle_c...
CodePudding user response:
Just check if metadata is empty if resp2.get("metadata") == []
. If it is then just do the following:
Procdf = pd.DataFrame(columns=["entityId", "detectedName", "EXE_NAME", "EXE_PATH"])
.
If it isn't then use your code.