Suppose I have list of dicts:
my lists = [
{'rank': 2, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B01MG0ORBL'
},
{'rank': 18, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B0735C9RDZ'
},
{'rank': 21, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B07FPVR858'
},
{'rank': 126, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B01MG0ORBL'
},
{'rank': 128, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B0735C9RDZ'
},
{'rank': 136, 'keyword_name': 'mens wallet', 'volume': 456677, 'asin': 'B07FPVR858'
},
{'rank': 19, 'keyword_name': 'leather wallets', 'volume': 23, 'asin': 'B0735C9RDZ'
},
{'rank': 10, 'keyword_name': 'wallets for men', 'volume': 566, 'asin': 'B07FPVR858'
},
{'rank': 16, 'keyword_name': 'wallets for men', 'volume': 566, 'asin': 'B0735C9RDZ'
},
]
I want to group by asin and keyword_name since they appear more than once in the list of dicts, so my goal is to have a dataframe that looks like this:
**keyword_name volume B01MG0ORBL B0735C9RDZ B07FPVR858** // column headers
mens wallet 456677 2 126 18 128 19 16 21 10
leather wallets 23
wallets for men 566 16 10
So initially I am thinking of
d = [{d['asin']:d['rank'] for d in l} for l in my_lists]
pd.dataframe(d)
// save as xlsx file
writer = pd.ExcelWriter(f"{path}/sheet.xlsx", engine="xlsxwriter")
d.to_excel(
writer, sheet_name="Organic", startrow=0, header=True, index=False
)
But not possible since it will ran into errors TypeError: string indices must be integers
.
CodePudding user response:
You can create DataFrame
and then pivoting with list
s:
df = pd.DataFrame(my_lists)
df = df.pivot_table(index=['keyword_name','volume'],
columns='asin',
values='rank',
aggfunc=list)
print (df)
asin B01MG0ORBL B0735C9RDZ B07FPVR858
keyword_name volume
leather wallets 23 NaN [19] NaN
mens wallet 456677 [2, 126] [18, 128] [21, 136]
wallets for men 566 NaN [16] [10]
Or join converted values to strings:
df = pd.DataFrame(my_lists)
df = (df.assign(rank=df['rank'].astype(str))
.pivot_table(index=['keyword_name','volume'],
columns='asin',
values='rank',
aggfunc=' '.join,
fill_value=''))
print (df)
asin B01MG0ORBL B0735C9RDZ B07FPVR858
keyword_name volume
leather wallets 23 19
mens wallet 456677 2 126 18 128 21 136
wallets for men 566 16 10