I have a list of dictionaries that look like the following:
[
    {
        "id": "42f409d0-2cef-49b0-a027-59ed571cc2a9",
        "cost": 0.868422,
        "environment": "nonprod"
    },
    {
        "id": "42f409d0-2cef-49b0-a027-59ed571cc2a9",
        "cost": 0.017017,
        "environment": "prod"
    },
    {
        "id": "aa385029-afa6-4f1a-a1d9-d88b7d934699",
        "cost": 0.010304,
        "environment": "prod"
    },
    {
        "id": "b13a0676-6926-49db-808c-3c968a9278eb",
        "cost": 2.870336,
        "environment": "nonprod"
    },
    {
        "id": "b13a0676-6926-49db-808c-3c968a9278eb",
        "cost": 0.00455,
        "environment": "prod"
    },
    {
        "id": "b13a0676-6926-49db-808c-3c968a9278eb",
        "cost": 0.032458,
        "environment": "prod"
    }
]
The part I'm having a hard time understanding is how to group these by id and environment and add up their costs for each environment.
The end result should have at most one prod entry and one nonprod entry per id, with all prod costs for that id summed into the prod entry and all nonprod costs summed into the nonprod entry.
I hope this is enough detail, thank you!
CodePudding user response:
d = {}
for el in data:
    # Key each running total by the (id, environment) pair.
    key = (el["id"], el["environment"])
    d[key] = d.get(key, 0) + el["cost"]
d
# {('42f409d0-2cef-49b0-a027-59ed571cc2a9', 'nonprod'): 0.868422,
# ('42f409d0-2cef-49b0-a027-59ed571cc2a9', 'prod'): 0.017017,
# ('aa385029-afa6-4f1a-a1d9-d88b7d934699', 'prod'): 0.010304,
# ('b13a0676-6926-49db-808c-3c968a9278eb', 'nonprod'): 2.870336,
# ('b13a0676-6926-49db-808c-3c968a9278eb', 'prod'): 0.037008}
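The question asks for a list of dictionaries as the end result, so the tuple-keyed totals may need to be converted back. Here is a minimal sketch of one way to do that, assuming the same field names as the input. The names `totals` and `result` are chosen here for illustration, and the sample rows are a subset of the data in the question.

```python
# A subset of the rows from the question, used as sample input.
data = [
    {"id": "b13a0676-6926-49db-808c-3c968a9278eb", "cost": 2.870336, "environment": "nonprod"},
    {"id": "b13a0676-6926-49db-808c-3c968a9278eb", "cost": 0.00455, "environment": "prod"},
    {"id": "b13a0676-6926-49db-808c-3c968a9278eb", "cost": 0.032458, "environment": "prod"},
]

# Sum costs per (id, environment) pair, as in the loop above.
totals = {}
for el in data:
    key = (el["id"], el["environment"])
    totals[key] = totals.get(key, 0) + el["cost"]

# Rebuild a list of dictionaries with the same fields as the input.
result = [
    {"id": id_, "environment": env, "cost": cost}
    for (id_, env), cost in totals.items()
]
```

Because the costs are floats, the sums can carry small rounding errors; rounding when displaying them is usually enough.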
CodePudding user response:
Try placing the dictionaries into a pandas DataFrame, then use pandas's groupby method (I set the variable dicts to your list of dictionaries above):
import pandas as pd
df = pd.DataFrame(dicts)
df.groupby(["id", "environment"], as_index=False).sum()
Output:
                                     id environment      cost
0  42f409d0-2cef-49b0-a027-59ed571cc2a9     nonprod  0.868422
1  42f409d0-2cef-49b0-a027-59ed571cc2a9        prod  0.017017
2  aa385029-afa6-4f1a-a1d9-d88b7d934699        prod  0.010304
3  b13a0676-6926-49db-808c-3c968a9278eb     nonprod  2.870336
4  b13a0676-6926-49db-808c-3c968a9278eb        prod  0.037008
CodePudding user response:
In addition to the answers above, a plain Python solution that builds a unique id|environment key for each entry will also work. The code below does exactly that, and it can be made even easier to read with defaultdict.
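dictCounter = dict()
# assuming test is the list of dicts
for eachEntry in test:
    # Build a unique "id|environment" key for this entry.
    newUniqueKey = eachEntry["id"] + "|" + eachEntry["environment"]
    if newUniqueKey not in dictCounter:
        # First time this key is seen: start its total.
        dictCounter[newUniqueKey] = eachEntry["cost"]
    else:
        # Key already present: add to the running total.
        dictCounter[newUniqueKey] += eachEntry["cost"]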
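For reference, the defaultdict variant mentioned in this answer might look roughly like the sketch below. It assumes the same id|environment key format; `dict_counter` and `entry` are illustrative names, and the sample rows are a subset of the data in the question.

```python
from collections import defaultdict

# A subset of the rows from the question, used as sample input.
test = [
    {"id": "b13a0676-6926-49db-808c-3c968a9278eb", "cost": 2.870336, "environment": "nonprod"},
    {"id": "b13a0676-6926-49db-808c-3c968a9278eb", "cost": 0.00455, "environment": "prod"},
    {"id": "b13a0676-6926-49db-808c-3c968a9278eb", "cost": 0.032458, "environment": "prod"},
]

# defaultdict(float) starts every missing key at 0.0,
# so the if/else membership check is no longer needed.
dict_counter = defaultdict(float)
for entry in test:
    unique_key = entry["id"] + "|" + entry["environment"]
    dict_counter[unique_key] += entry["cost"]
```

One caveat with string keys: if an id could itself contain the `|` character, a tuple key such as `(entry["id"], entry["environment"])` avoids any ambiguity.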