I'm trying to filter the following example .json file, keeping only the entries where ["cbaCodeParts"]["HHH"] differs from "300":
{
"took" : 32,
"timed_out" : false,
"_shards" : {
"total" : 12,
"successful" : 12,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1549,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "ib-prodfulltext-t24-transhist-202211",
"_type" : "_doc",
"_id" : "D7JGOQTS2XPVSG6HN",
"_score" : null,
"_source" : {
"accountNbr" : 6900069,
"accountNbrText" : "6900069",
"acctApplNbr" : "02",
"acknowledgementDate" : "2022-11-01T01:46:38.000 01:00",
"acknowledgementDateText" : "2022-11-01",
"avoType" : "ADI",
"bankCode" : "0100",
"bankingCore" : "T24",
"bazenType" : "ADI",
"businessDate" : "2022-11-01",
"cbaCode" : "10000101002",
"cbaCodeParts" : {
"BBB" : "002",
"HHH" : "100",
"TT" : "01",
"VVV" : "001"
},
"chargeType" : "SHAR",
"creditDebitIndicator" : "D",
"currencyCode" : "CZK",
...
I have tried:
import json
with open('2022-10.json', 'r') as f:
    input_dict = json.load(f)
output_dict = [x for x in input_dict if not x['HHH'] == "300"]
output_json = json.dumps(output_dict)
print(output_json)
...which raises:
TypeError: string indices must be integers
I think I'm failing to query the JSON file down to the fifth level, but I'm rather lost in the structure.
Help would be appreciated.
CodePudding user response:
You have to provide the exact path inside your JSON tree:
import json
with open('2022-10.json', 'r') as f:
    input_dict = json.load(f)
# The records are in the list stored under the nested "hits" -> "hits" keys.
output_dict = [
    x for x in input_dict["hits"]["hits"]
    if x["_source"]["cbaCodeParts"]["HHH"] != "300"
]
output_json = json.dumps(output_dict)
print(output_json)
Iterating over just the top-level data, which is a dictionary, gives you only that dictionary's keys (plain strings such as "took" and "hits"), not the array you want; indexing one of those strings with 'HHH' is what raises the TypeError. The code above assumes you want to loop over the array stored in the second, nested "hits" field.
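To make the difference concrete, here is a minimal sketch on a trimmed-down, made-up sample that only mirrors the shape of the file in the question. The sample records, their "_id" values, and the helper name hhh_of are assumptions for illustration, not part of the original data. It also uses .get() so that a record without "cbaCodeParts" does not raise a KeyError; whether such records should be kept or dropped is your call, and this sketch keeps them.

```python
import json

# Hypothetical, trimmed-down sample that mirrors the structure in the question.
sample = json.loads("""
{
  "took": 32,
  "hits": {
    "total": {"value": 3, "relation": "eq"},
    "hits": [
      {"_id": "a", "_source": {"cbaCodeParts": {"HHH": "100"}}},
      {"_id": "b", "_source": {"cbaCodeParts": {"HHH": "300"}}},
      {"_id": "c", "_source": {}}
    ]
  }
}
""")

# Iterating over a dict yields its keys, which are plain strings.
# Indexing one of them with 'HHH' is what triggers the TypeError.
top_level_items = [x for x in sample]


def hhh_of(hit):
    """Return the HHH part of a hit, or None if any level is missing."""
    return hit.get("_source", {}).get("cbaCodeParts", {}).get("HHH")


# Keep every hit whose HHH is not "300" (hits without HHH are kept too).
filtered = [hit for hit in sample["hits"]["hits"] if hhh_of(hit) != "300"]

print(top_level_items)
print([hit["_id"] for hit in filtered])
```

If you would rather drop records that have no "HHH" value at all, add a check such as hhh_of(hit) is not None to the condition.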
Also, posting more of your JSON file might help people understand your problem better.