I am trying to grab data from JSON file based on what quarter the dates represent. My goal is to assign the data to a variable so I should have Q1, Q2, Q3, Q4 variables holding the data inside. Below is the JSON:
{
"lastDate":{
"0":"2022Q4",
"1":"2022Q4",
"2":"2022Q4",
"7":"2022Q4",
"8":"2022Q4",
"9":"2022Q4",
"18":"2022Q3",
"19":"2022Q3",
"22":"2022Q3",
"24":"2022Q2"
},
"transactionType":{
"0":"Sell",
"1":"Automatic Sell",
"2":"Automatic Sell",
"7":"Automatic Sell",
"8":"Sell",
"9":"Automatic Sell",
"18":"Automatic Sell",
"19":"Automatic Sell",
"22":"Automatic Sell",
"24":"Automatic Sell"
},
"sharesTraded":{
"0":"20,200",
"1":"176,299",
"2":"8,053",
"7":"167,889",
"8":"13,250",
"9":"176,299",
"18":"96,735",
"19":"15,366",
"22":"25,000",
"24":"25,000"
}
}
Now if i try to use the following code:
import json
data = json.load(open("AAPL22data.json"))
Q2data = [item for item in data if '2022Q2' in data['lastDate']]
print(Q2data)
My ideal output should be:
{
"lastDate":{
"24":"2022Q2"
},
"transactionType":{
"24":"Automatic Sell"
},
"sharesTraded":{
"24":"25,000"
}
}
And then repeat the same structure for the other quarters. However, my current output gives me "[ ]"
CodePudding user response:
With pandas you can read this nested dictionary a transform it to a table representation. Then the aggregation you are required becomes quite natural.
import pandas as pd
sample_dict = {
"lastDate":{
"0":"2022Q4",
"1":"2022Q4",
"2":"2022Q4",
"7":"2022Q4",
"8":"2022Q4",
"9":"2022Q4",
"18":"2022Q3",
"19":"2022Q3",
"22":"2022Q3",
"24":"2022Q2"
},
"transactionType":{
"0":"Sell",
"1":"Automatic Sell",
"2":"Automatic Sell",
"7":"Automatic Sell",
"8":"Sell",
"9":"Automatic Sell",
"18":"Automatic Sell",
"19":"Automatic Sell",
"22":"Automatic Sell",
"24":"Automatic Sell"
},
"sharesTraded":{
"0":"20,200",
"1":"176,299",
"2":"8,053",
"7":"167,889",
"8":"13,250",
"9":"176,299",
"18":"96,735",
"19":"15,366",
"22":"25,000",
"24":"25,000"
}
}
print(pd.DataFrame.from_dict(sample_dict))
returns
Output:
lastDate transactionType sharesTraded
0 2022Q4 Sell 20,200
1 2022Q4 Automatic Sell 176,299
2 2022Q4 Automatic Sell 8,053
7 2022Q4 Automatic Sell 167,889
8 2022Q4 Sell 13,250
9 2022Q4 Automatic Sell 176,299
18 2022Q3 Automatic Sell 96,735
19 2022Q3 Automatic Sell 15,366
22 2022Q3 Automatic Sell 25,000
24 2022Q2 Automatic Sell 25,000
then a simple group_by should do the trick.
CodePudding user response:
Use a dictionary comprehension:
import json
my_json = """{
"lastDate":{
"0":"2022Q4",
"1":"2022Q4",
"2":"2022Q4",
"7":"2022Q4",
"8":"2022Q4",
"9":"2022Q4",
"18":"2022Q3",
"19":"2022Q3",
"22":"2022Q3",
"24":"2022Q2"
},
"transactionType":{
"0":"Sell",
"1":"Automatic Sell",
"2":"Automatic Sell",
"7":"Automatic Sell",
"8":"Sell",
"9":"Automatic Sell",
"18":"Automatic Sell",
"19":"Automatic Sell",
"22":"Automatic Sell",
"24":"Automatic Sell"
},
"sharesTraded":{
"0":"20,200",
"1":"176,299",
"2":"8,053",
"7":"167,889",
"8":"13,250",
"9":"176,299",
"18":"96,735",
"19":"15,366",
"22":"25,000",
"24":"25,000"
}
}"""
data = json.loads(my_json)
var = "24"
data = {k:{var: v[var]} for k, v in data.items()}
data = json.dumps(data, indent = 2)
print(data)
Output:
{
"lastDate": {
"24": "2022Q2"
},
"transactionType": {
"24": "Automatic Sell"
},
"sharesTraded": {
"24": "25,000"
}
}