I'm pretty new working with JSON files. I have two JSON Files that looks like this:
J1:
{'country': PA,
'year': '2021',
'month': '10',
'client': 'x',
'data': [{'id': 'g0084',
'lat': y,
'lon': q,
'avg_audience': 87.84,
'audiences': [{'day': '01', 'audience': 84},
{'day': '02', 'audience': 83},
{'day': '03', 'audience': 52},
...
{'id': 'g0093',
'lat': y,
'lon': q,
'avg_audience': 36.13,
'audiences': [{'day': '01', 'audience': 48},
{'day': '02', 'audience': 47},
{'day': '03', 'audience': 39},
J2:
[{'id': 'g0084', 'day': '01', 'audience': 190365.76367661008},
{'id': 'g0084', 'day': '02', 'audience': 186712.00627779233},
{'id': 'g0084', 'day': '03', 'audience': 161048.6891385768},
...
{'id': 'g0093', 'day': '01', 'audience': 108780.43638663433},
{'id': 'g0093', 'day': '02', 'audience': 105728.4854826053},
{'id': 'g0093', 'day': '03', 'audience': 120786.51685393258},
So I want to replace the audience value in J1 with the audience value in J2. Based on the same ID and day. My desired output looks like this:
{'country': PA,
'year': '2021',
'month': '10',
'client': 'x',
'data': [{'id': 'g0084',
'lat': y,
'lon': q,
'avg_audience': 87.84,
'audiences': [{'day': '01', 'audience': 190365.76367661008}},
{'day': '02', 'audience': 186712.00627779233},
{'day': '03', 'audience': 161048.6891385768},
...
{'id': 'g0093',
'lat': y,
'lon': q,
'avg_audience': 36.13,
'audiences': [{'day': '01', 'audience': 108780.4},
{'day': '02', 'audience': 105728.48},
{'day': '03', 'audience': 105728.48},
And I have no idea how to do this. Any help, please?
CodePudding user response:
Here's a way to do this, but I'd like to rant a bit first. It seems that more and more people are using JSON for what rightly should be stored in a database. This is one such example. JSON was designed for data interchange, not for long-term storage.
But even if you don't use a database, the data structure here is not correct. Why is 'data'
a list of objects, instead of being an object where 'id'
is the key? Same with audiences
. That should not be a list of records, it should be an object where 'day'
is the key and 'audience'
is the value. Those changes would have made this update trivial. Indeed, if you had many thousands of records, I would probably create a dictionary index of J1 before beginning the update.
J1 = {'country': 'PA',
'year': '2021',
'month': '10',
'client': 'x',
'data': [
{'id': 'g0084',
'lat': 123.456,
'lon': 123.456,
'avg_audience': 87.84,
'audiences': [{'day': '01', 'audience': 84},
{'day': '02', 'audience': 83},
{'day': '03', 'audience': 52},
]
},
{'id': 'g0093',
'lat': 123.456,
'lon': 123.456,
'avg_audience': 36.13,
'audiences': [{'day': '01', 'audience': 48},
{'day': '02', 'audience': 47},
{'day': '03', 'audience': 39}
]
}
]
}
J2 = [
{'id': 'g0084', 'day': '01', 'audience': 190365.76367661008},
{'id': 'g0084', 'day': '02', 'audience': 186712.00627779233},
{'id': 'g0084', 'day': '03', 'audience': 161048.6891385768},
{'id': 'g0093', 'day': '01', 'audience': 108780.43638663433},
{'id': 'g0093', 'day': '02', 'audience': 105728.4854826053},
{'id': 'g0093', 'day': '03', 'audience': 120786.51685393258}
]
# Make an index into J1['data].
idx = {}
for row in J1['data']:
idx[row['id']] = row
for row in J2:
# Find the id.
idx = [t for t in J1['data'] if t['id'] == row['id']][0]
# Find the day.
day = [t for t in idx['audiences'] if t['day'] == row['day']][0]
day['audience'] = row['audience']
from pprint import pprint
pprint(J1)
Output:
{'client': 'x',
'country': 'PA',
'data': [{'audiences': [{'audience': 190365.76367661008, 'day': '01'},
{'audience': 186712.00627779233, 'day': '02'},
{'audience': 161048.6891385768, 'day': '03'}],
'avg_audience': 87.84,
'id': 'g0084',
'lat': 123.456,
'lon': 123.456},
{'audiences': [{'audience': 108780.43638663433, 'day': '01'},
{'audience': 105728.4854826053, 'day': '02'},
{'audience': 120786.51685393258, 'day': '03'}],
'avg_audience': 36.13,
'id': 'g0093',
'lat': 123.456,
'lon': 123.456}],
'month': '10',
'year': '2021'}