I'm looking for your advice, I have a CSV file with data:
I'm trying to find all unique keywords in 'keyword' columns and add them to a nested dict e.g.
data = {
"sony a5000": {"7/25/22": "422,484", "8/25/22":"405,284", "9/25/22": "402,210" ....,
"regalo para hombre": {"7/25/22": "422,485", "8/25/22": "382,485", "9/25/22": "327,321" ....,
I have used duplicated() method to find all duplicate keywords in a column and add them to a variable (res = df[df.duplicated('keyword')])
but, I can't manage to add the items into a DICT, I tried for loop but without results.
can you please give me a clue what function or method I should use in pandas to convert the data into dict.
CodePudding user response:
Sample data:
df = pd.DataFrame({
"keyword": ["sony", "sony", "maroon", "maroon"],
"searches": ["422,484", "422,485", "422,486", "422,487"],
"date": ["7/5/22", "7/5/23", "7/5/24", "7/5/25"]
})
split the data
def extract_values(group):
d = {}
for index, row in group.iterrows():
d[row["date"]] = row["searches"]
return {row["keyword"] : d}
result = df.groupby("keyword").apply(lambda x: extract_values(x))
result.tolist()
sample output
[{'maroon': {'7/5/24': '422,486', '7/5/25': '422,487'}},
{'sony': {'7/5/22': '422,484', '7/5/23': '422,485'}}]
CodePudding user response:
You can use dictionary comprehension for that:
{g[0]:{k:v for k, v in zip(g[1].date, g[1].searches)} for g in df.groupby('keyword')}
Input:
df = pd.DataFrame({"keyword": ["sony", "sony", "maroon", "maroon"], "searches": ["422,484", "422,485", "422,486", "422,487"], "date": ["7/5/22", "7/5/23", "7/5/24", "7/5/25"]})
Output:
{'maroon': {'7/5/24': '422,486', '7/5/25': '422,487'},
'sony': {'7/5/22': '422,484', '7/5/23': '422,485'}}