Hi, I am having trouble grouping and plotting the data below.
       signed                         last_seen  mon
0      Signed  2022-09-07 00:18:14.149000+00:00    9
1      Signed  2022-09-01 06:19:11.571000+00:00    9
2      Signed  2022-09-07 00:18:14.142000+00:00    9
3      Signed  2022-09-07 00:43:10.486000+00:00    9
4      Signed  2022-09-07 01:02:54.170000+00:00    9
...       ...                               ...  ...
995  Unsigned  2022-08-31 01:37:44.696000+00:00    8
996  Unsigned  2022-08-31 01:37:44.682000+00:00    8
997  Unsigned  2022-08-31 01:37:44.683000+00:00    8
998  Unsigned  2022-08-31 02:48:17.310000+00:00    8
999  Unsigned  2022-09-06 09:29:33.890000+00:00    9
1000 rows × 3 columns
I get this using the following code:
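import json
import matplotlib.pyplot as plt
import pandas as pd
# Load the raw JSON; the context manager closes the file afterwards
with open('binary.json', 'r') as f:
    data = json.load(f)
# Flatten the records under 'results' into a DataFrame
df = pd.json_normalize(data['results'])
# Drop every column except the two of interest
keep_list = ["last_seen", "signed"]
df = df.drop(columns=df.columns.difference(keep_list))
# Parse the timestamps and derive the month number
df['last_seen'] = pd.to_datetime(df['last_seen'])
df['mon'] = df['last_seen'].dt.month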
I cannot work out how to group by month and signed status.
I then need to plot the result as a stacked bar chart, with one bar per month split into signed vs. unsigned counts.
Can anyone please help me out?
CodePudding user response:
Group the data by month and signed status, count the rows in each group,
and "unstack" the signed level so that each label gets its own column:
df_grp = df.groupby(['mon', 'signed'])['mon'].count().unstack('signed')
>> signed  Signed  Unsigned
>> mon
>> 8            2         4
>> 9            3         1
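One thing to watch for: if a month has no rows for one of the labels, unstack leaves NaN in that cell. A minimal sketch of one way to handle this, using made-up sample rows (not the real binary.json data) and passing fill_value=0 to unstack. pd.crosstab is shown as an alternative that builds the same count table in one call:

```python
import pandas as pd

# Hypothetical sample rows standing in for the real data;
# month 9 deliberately has no "Unsigned" entries
df = pd.DataFrame({
    "signed": ["Signed", "Signed", "Unsigned", "Unsigned", "Signed", "Signed"],
    "mon": [8, 8, 8, 8, 9, 9],
})

# size() counts rows per (mon, signed) pair; fill_value=0 replaces the
# NaN that unstack would otherwise leave for month 9 / "Unsigned"
df_grp = df.groupby(["mon", "signed"]).size().unstack("signed", fill_value=0)

# Alternative: pd.crosstab produces the same month-by-label count table
df_xtab = pd.crosstab(df["mon"], df["signed"])

print(df_grp)
```

Either table can be passed to the plotting step unchanged, since both have the months as the index and one column per label.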
Plot a stacked bar chart using the pandas plotting API:
df_grp[['Signed','Unsigned']].plot(kind='bar', stacked=True)
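For reference, here is a self-contained sketch of the plotting step with axis labels added. The count table is typed in by hand from the example output above rather than computed from the real data. The "Agg" backend and the label text are assumptions made so the snippet runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Hand-typed count table copied from the example output
# (months as the index, one column per label)
df_grp = pd.DataFrame(
    {"Signed": [2, 3], "Unsigned": [4, 1]},
    index=pd.Index([8, 9], name="mon"),
)

# DataFrame.plot returns the matplotlib Axes, which can then be labelled
ax = df_grp[["Signed", "Unsigned"]].plot(kind="bar", stacked=True)
ax.set_xlabel("Month")
ax.set_ylabel("Count")
ax.set_title("Signed vs. unsigned by month")

fig = ax.get_figure()
fig.tight_layout()
```

In a notebook the chart displays on its own; in a script, call plt.show() (with an interactive backend) or fig.savefig(...) to see the result.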