Home > Software engineering >  Assign data in JSON file to a variable based on condition python
Assign data in JSON file to a variable based on condition python

Time:12-03

I am trying to grab data from JSON file based on what quarter the dates represent. My goal is to assign the data to a variable so I should have Q1, Q2, Q3, Q4 variables holding the data inside. Below is the JSON:

{
    "lastDate":{
        "0":"2022Q4",
        "1":"2022Q4",
        "2":"2022Q4",
        "7":"2022Q4",
        "8":"2022Q4",
        "9":"2022Q4",
        "18":"2022Q3",
        "19":"2022Q3",
        "22":"2022Q3",
        "24":"2022Q2"
    },
    "transactionType":{
        "0":"Sell",
        "1":"Automatic Sell",
        "2":"Automatic Sell",
        "7":"Automatic Sell",
        "8":"Sell",
        "9":"Automatic Sell",
        "18":"Automatic Sell",
        "19":"Automatic Sell",
        "22":"Automatic Sell",
        "24":"Automatic Sell"
    },
    "sharesTraded":{
        "0":"20,200",
        "1":"176,299",
        "2":"8,053",
        "7":"167,889",
        "8":"13,250",
        "9":"176,299",
        "18":"96,735",
        "19":"15,366",
        "22":"25,000",
        "24":"25,000"
    }
}

Now if i try to use the following code:

import json

data = json.load(open("AAPL22data.json"))

Q2data = [item for item in data if '2022Q2' in data['lastDate']]
print(Q2data)

My ideal output should be:

{
    "lastDate":{
        "24":"2022Q2"
    },
    "transactionType":{
        "24":"Automatic Sell"
    },
    "sharesTraded":{
        "24":"25,000"
    }
}

And then repeat the same structure for the other quarters. However, my current output gives me "[ ]"

CodePudding user response:

With pandas you can read this nested dictionary a transform it to a table representation. Then the aggregation you are required becomes quite natural.

import pandas as pd 

sample_dict = {
    "lastDate":{
        "0":"2022Q4",
        "1":"2022Q4",
        "2":"2022Q4",
        "7":"2022Q4",
        "8":"2022Q4",
        "9":"2022Q4",
        "18":"2022Q3",
        "19":"2022Q3",
        "22":"2022Q3",
        "24":"2022Q2"
    },
    "transactionType":{
        "0":"Sell",
        "1":"Automatic Sell",
        "2":"Automatic Sell",
        "7":"Automatic Sell",
        "8":"Sell",
        "9":"Automatic Sell",
        "18":"Automatic Sell",
        "19":"Automatic Sell",
        "22":"Automatic Sell",
        "24":"Automatic Sell"
    },
    "sharesTraded":{
        "0":"20,200",
        "1":"176,299",
        "2":"8,053",
        "7":"167,889",
        "8":"13,250",
        "9":"176,299",
        "18":"96,735",
        "19":"15,366",
        "22":"25,000",
        "24":"25,000"
    }
}

print(pd.DataFrame.from_dict(sample_dict))

returns

Output:

   lastDate transactionType sharesTraded
0    2022Q4            Sell       20,200
1    2022Q4  Automatic Sell      176,299
2    2022Q4  Automatic Sell        8,053
7    2022Q4  Automatic Sell      167,889
8    2022Q4            Sell       13,250
9    2022Q4  Automatic Sell      176,299
18   2022Q3  Automatic Sell       96,735
19   2022Q3  Automatic Sell       15,366
22   2022Q3  Automatic Sell       25,000
24   2022Q2  Automatic Sell       25,000

then a simple group_by should do the trick.

CodePudding user response:

Use a dictionary comprehension:

import json

my_json = """{
    "lastDate":{
        "0":"2022Q4",
        "1":"2022Q4",
        "2":"2022Q4",
        "7":"2022Q4",
        "8":"2022Q4",
        "9":"2022Q4",
        "18":"2022Q3",
        "19":"2022Q3",
        "22":"2022Q3",
        "24":"2022Q2"
    },
    "transactionType":{
        "0":"Sell",
        "1":"Automatic Sell",
        "2":"Automatic Sell",
        "7":"Automatic Sell",
        "8":"Sell",
        "9":"Automatic Sell",
        "18":"Automatic Sell",
        "19":"Automatic Sell",
        "22":"Automatic Sell",
        "24":"Automatic Sell"
    },
    "sharesTraded":{
        "0":"20,200",
        "1":"176,299",
        "2":"8,053",
        "7":"167,889",
        "8":"13,250",
        "9":"176,299",
        "18":"96,735",
        "19":"15,366",
        "22":"25,000",
        "24":"25,000"
    }
}"""
data = json.loads(my_json)

var = "24"

data = {k:{var: v[var]} for k, v in data.items()}
data = json.dumps(data, indent = 2)

print(data)

Output:

{
  "lastDate": {
    "24": "2022Q2"
  },
  "transactionType": {
    "24": "Automatic Sell"
  },
  "sharesTraded": {
    "24": "25,000"
  }
}
  • Related