I am practicing data analytics and I am stuck on one problem.
I grouped the dataframe by DATE PURCHASED and applied unique() because I want to count the unique values for each purchase date.
training_groupby = training.groupby('DATE PURCHASED')['Account - Store Name'].unique().to_frame()
So it looks like this: [image: GROUPBY DATE PURCHASED]
Now that the data has been aggregated, I want to count the items in that column, so I used .split(','):
training_groupby['Account - Store Name'].apply(lambda x: x.split(','))
but I got this error:
AttributeError: 'numpy.ndarray' object has no attribute 'split'
Can someone help me count the number of unique values per DATE PURCHASED? I've been trying to solve this for almost a week now. I searched YouTube and Google, but I couldn't find anything that helps.
CodePudding user response:
I think this is what you want:
training_groupby["Total Purchased"] = training_groupby["Account - Store Name"].apply(lambda x: len(set(x)))
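For context, the error most likely arises because unique() stores an array of names in each cell rather than a comma-separated string, so there is nothing to split and len() can count the items directly. Here is a minimal sketch of that idea; the sample rows are made up for illustration, and only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
training = pd.DataFrame({
    "DATE PURCHASED": ["13/01/2022", "13/01/2022", "13/01/2022", "14/01/2022"],
    "Account - Store Name": [
        "Landmark Makati", "Landmark Nuvali", "Landmark Makati", "Landmark Nuvali",
    ],
})

training_groupby = (
    training.groupby("DATE PURCHASED")["Account - Store Name"].unique().to_frame()
)

# Each cell is an array of unique store names (not a string),
# so len() gives the count without any splitting.
training_groupby["Total Purchased"] = (
    training_groupby["Account - Store Name"].apply(len)
)
print(training_groupby)
```

Since the arrays are already de-duplicated by unique(), len(x) and len(set(x)) should give the same count here.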
CodePudding user response:
You can do multiple aggregations in the same pandas.DataFrame.groupby call.
Try this:
out = (
    training
    .groupby(['DATE PURCHASED'])
    .agg(**{
        'Account - Store Name': ('Account - Store Name', 'unique'),
        'Items Count': ('Account - Store Name', 'nunique'),
    })
)
# Output :
print(out)
                                      Account - Store Name  Items Count
DATE PURCHASED
13/01/2022              [Landmark Makati, Landmark Nuvali]            2
14/01/2022                               [Landmark Nuvali]            1
15/01/2022            [Robinsons Dolores, Landmark Nuvali]            2
16/01/2022      [Robinsons Ilocos Norte, Landmarj Trinoma]            2
19/01/2022                              [Shopwise Alabang]            1
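If only the count is needed and not the list of names, a shorter route is to call nunique() on the grouped column directly. This is a sketch with made-up sample rows (only the column names come from the question), not necessarily what the asker's data looks like.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
training = pd.DataFrame({
    "DATE PURCHASED": ["13/01/2022", "13/01/2022", "13/01/2022", "14/01/2022"],
    "Account - Store Name": [
        "Landmark Makati", "Landmark Nuvali", "Landmark Makati", "Landmark Nuvali",
    ],
})

# nunique() counts distinct values per group; missing values are
# ignored by default (dropna=True).
counts = training.groupby("DATE PURCHASED")["Account - Store Name"].nunique()
print(counts)
```

The result is a Series indexed by DATE PURCHASED; calling .to_frame("Items Count") on it would turn it into a one-column dataframe if that is more convenient.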