I have this dataframe:
Month_Year City_Name Chain_Name Product_Name Product_Price
11-2021 London Aldi Pasta 2.33
11-2021 Bristol Spar Bananas 1.45
10-2021 London Tesco Olives 4.12
10-2021 Cardiff Spar Pasta 2.25
This dataframe will be displayed in nested collapsible which will expand in the following order:
Date_month:
City_Name:
Shop_name:
Product_name : Price
Theefore, I want to transform the dataframe to the following structure:
{10-2021:
{London:
{Aldi:
{Pasta:2.33}}},
{Bristol:{
{Spar:
{Bananas:1.45}}}}
Essentially I want to groupby
every element recursively, starting by purchase month. The closest thing I've found is the solution in this answer, but as dictionary doesn't take duplicate values as I might have duplicate values in every column. Other than that, I am not sure if using a dictionary is the optimal answer for this problem.
My best guess is I have to use other DS type such as some type of tree, but couldn't find any way to do that.
Any advises?
CodePudding user response:
You can group your dataframe by all columns except price, then create your dictionaries in a loop:
# if more than one price for one product in a chain, then calculate mean:
grouped_df = df.groupby(['Month_Year', 'City_Name', 'Chain_Name', 'Product_Name']).agg('mean')
result = dict()
nested_dict = dict()
for index, value in grouped_df.itertuples():
for i, key in enumerate(index):
if i == 0:
if not key in result:
result[key] = {}
nested_dict = result[key]
elif i == len(index) - 1:
nested_dict[key] = value
else:
if not key in nested_dict:
nested_dict[key] = {}
nested_dict = nested_dict[key]
print(json.dumps(result, indent=4))
Changing your df to show nested dict and mean calculation to:
Month_Year City_Name Chain_Name Product_Name Product_Price
0 11-2021 London Aldi Pasta 2.33
1 11-2021 London Aldi Pasta 2.35
2 11-2021 London Aldi Olives 3.99
3 11-2021 Bristol Spar Bananas 1.45
4 10-2021 London Tesco Olives 4.12
5 10-2021 Cardiff Spar Pasta 2.25
You get the output:
{
"10-2021": {
"Cardiff": {
"Spar": {
"Pasta": 2.25
}
},
"London": {
"Tesco": {
"Olives": 4.12
}
}
},
"11-2021": {
"Bristol": {
"Spar": {
"Bananas": 1.45
}
},
"London": {
"Aldi": {
"Olives": 3.99,
"Pasta": 2.34
}
}
}
}