Home > OS >  Python - Create plot with percentage of occurence
Python - Create plot with percentage of occurence

Time:11-27

I have a dataframe which contains orders. Each product has a color. I want to create a (line) plot of monthly data and show the occurrence of colors throughout the month.

A snippet of the current dataframe:

                         Color 
2021-08-25 17:43:30      Blue
2021-08-25 17:26:34      Blue
2021-08-25 17:15:51      Green
2021-09-02 14:23:19      Blue
2021-09-04 18:11:17      Yellow

I thought I needed to create an extra column with the percentage of occurrence throughout the month first. I tried using:

df.groupby(['Color']).Color.agg([('Color_count', 'count')]).reset_index()

Which gave me:

           Color       Color_count
0          Blue        2
1          Green       1

The desired output should give me columns with all the colors and the percentage of occurrence per month, something like:

                         Blue       Green      Yellow     
2021-08-31               0.73       0.24       0.00
2021-09-30               0.66       0.29       0.01

With those percentages I can make a plot to show monthly data of the colors.

Thank you in advance.

CodePudding user response:

Use Grouper with SeriesGroupBy.value_counts and Series.unstack:

df1 = (df.groupby(pd.Grouper(freq='M'))['Color']
         .value_counts(normalize=True)
         .unstack(fill_value=0)
         .rename_axis(None, axis=1))
print (df1)
                Blue     Green  Yellow
2021-08-31  0.666667  0.333333     0.0
2021-09-30  0.500000  0.000000     0.5
  • Related