Grouping and Plotting Pandas Dataframe

Time:09-09

Hi, I am having trouble grouping and plotting data with pandas. My DataFrame looks like this:

     signed    last_seen                         mon
0    Signed    2022-09-07 00:18:14.149000+00:00  9
1    Signed    2022-09-01 06:19:11.571000+00:00  9
2    Signed    2022-09-07 00:18:14.142000+00:00  9
3    Signed    2022-09-07 00:43:10.486000+00:00  9
4    Signed    2022-09-07 01:02:54.170000+00:00  9
...  ...       ...                               ...
995  Unsigned  2022-08-31 01:37:44.696000+00:00  8
996  Unsigned  2022-08-31 01:37:44.682000+00:00  8
997  Unsigned  2022-08-31 01:37:44.683000+00:00  8
998  Unsigned  2022-08-31 02:48:17.310000+00:00  8
999  Unsigned  2022-09-06 09:29:33.890000+00:00  9
1000 rows × 3 columns

I get this using the following code:

import json
import pandas as pd
import matplotlib.pyplot as plt

# Load the raw JSON; the records live under the "results" key
with open('binary.json', 'r') as f:
    data = json.load(f)

df = pd.json_normalize(data['results'])
# Keep only the two columns of interest
df = df[['signed', 'last_seen']].copy()
# Parse the timestamps and derive the month number
df['last_seen'] = pd.to_datetime(df['last_seen'])
df['mon'] = df['last_seen'].dt.month

I now cannot work out how to group by month and signed status.

I then need to plot this as a stacked bar chart, with one bar per month split into signed vs. unsigned counts.

Can anyone please help me out??

CodePudding user response:

Group the data by month and signed status, count the rows in each group, then "unstack" the signed level so that each label becomes its own column:

df_grp = df.groupby(['mon', 'signed'])['mon'].count().unstack('signed')

>>  signed  Signed  Unsigned
>>  mon                     
>>  8            2         4
>>  9            3         1
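
For anyone who wants to try the grouping step without the original file, here is a minimal sketch. The sample DataFrame is made up to match the counts shown above and stands in for the real 'binary.json' data. It uses size() instead of selecting a column and calling count(), and passes fill_value=0, which is an optional addition so that a month with no rows for one label shows 0 rather than NaN.

```python
import pandas as pd

# Hypothetical sample data standing in for the real 'binary.json' records
df = pd.DataFrame({
    'signed': ['Signed'] * 2 + ['Unsigned'] * 4 + ['Signed'] * 3 + ['Unsigned'],
    'mon':    [8] * 2 + [8] * 4 + [9] * 3 + [9],
})

# size() counts rows per (mon, signed) pair; unstack() pivots the
# 'signed' level into columns, and fill_value=0 replaces missing
# combinations with 0 instead of NaN
df_grp = df.groupby(['mon', 'signed']).size().unstack('signed', fill_value=0)
print(df_grp)
```

pd.crosstab(df['mon'], df['signed']) should give an equivalent table of counts if you prefer a single call.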

Plot stacked bar chart using pandas plot API:

df_grp[['Signed','Unsigned']].plot(kind='bar', stacked=True)
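
If you also want axis labels and a title, plot() returns a matplotlib Axes object that you can customise. The sketch below is self-contained: the counts are hypothetical, shaped like the df_grp from the answer, and the label text and output filename are placeholders of my own.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical counts in the same shape as the grouped table in the answer
df_grp = pd.DataFrame(
    {'Signed': [2, 3], 'Unsigned': [4, 1]},
    index=pd.Index([8, 9], name='mon'),
)

# stacked=True puts Unsigned on top of Signed within each month's bar;
# rot=0 keeps the month tick labels horizontal
ax = df_grp[['Signed', 'Unsigned']].plot(kind='bar', stacked=True, rot=0)
ax.set_xlabel('Month')
ax.set_ylabel('Number of records')
ax.set_title('Signed vs. unsigned by month')

plt.tight_layout()
plt.savefig('signed_by_month.png')  # or plt.show() in an interactive session
```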
