I have a JSON file in the format shown below.
I want to preprocess this data. So far I have converted it to CSV using a pip package called jsoncsv. How can I read this JSON file directly with pandas, without first converting it to CSV through an external library? I looked at the pandas docs, but I don't think this JSON structure matches any of the 5 orient parameter types.
{"ip":"10.1.1.1","data":{"ssh":{"status":"connection-timeout","protocol":"ssh","error":"dial tcp 10.1.1.1:22 i/o timeout"}}}
{"ip":"10.12.4.7","data":{"ssh":{"status":"connection-timeout","protocol":"ssh","error":"dial tcp 10.12.4.7:22: i/o timeout"}}}
{"ip":"10.21.5.5","data":{"ssh":{"status":"connection-timeout","protocol":"ssh","error":"dial tcp 10.21.5.5:22 i/o timeout"}}}
{"ip":"10.12.6.2","data":{"ssh":{"status":"connection-timeout","protocol":"ssh","error":"dial tcp 10.12.6.2:22: i/o timeout"}}}
{"ip":"10.21.6.4","data":{"ssh":{"status":"connection-timeout","protocol":"ssh","error":"dial tcp 10.21.6.4:22: i/o timeout"}}}
CodePudding user response:
Try pandas.json_normalize:
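import json
import pandas as pd
with open("test.json") as f:  # test.json holds one JSON array (shown below)
    d = json.load(f)
df = pd.json_normalize(d)  # nested keys become dotted column names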
>>> df
          ip     data.ssh.status  data.ssh.protocol                      data.ssh.error
0   10.1.1.1  connection-timeout                ssh    dial tcp 10.1.1.1:22 i/o timeout
1  10.12.4.7  connection-timeout                ssh  dial tcp 10.12.4.7:22: i/o timeout
2  10.21.5.5  connection-timeout                ssh   dial tcp 10.21.5.5:22 i/o timeout
3  10.12.6.2  connection-timeout                ssh  dial tcp 10.12.6.2:22: i/o timeout
4  10.21.6.4  connection-timeout                ssh  dial tcp 10.21.6.4:22: i/o timeout
test.json (the same records, wrapped in a JSON array and separated by commas):
[{"ip":"10.1.1.1","data":{"ssh":{"status":"connection-timeout","protocol":"ssh","error":"dial tcp 10.1.1.1:22 i/o timeout"}}},
{"ip":"10.12.4.7","data":{"ssh":{"status":"connection-timeout","protocol":"ssh","error":"dial tcp 10.12.4.7:22: i/o timeout"}}},
{"ip":"10.21.5.5","data":{"ssh":{"status":"connection-timeout","protocol":"ssh","error":"dial tcp 10.21.5.5:22 i/o timeout"}}},
{"ip":"10.12.6.2","data":{"ssh":{"status":"connection-timeout","protocol":"ssh","error":"dial tcp 10.12.6.2:22: i/o timeout"}}},
{"ip":"10.21.6.4","data":{"ssh":{"status":"connection-timeout","protocol":"ssh","error":"dial tcp 10.21.6.4:22: i/o timeout"}}}]