I have a dataframe:
df = Batch_ID DateTime Code A1 A2
ABC. '2019-01-02 17:03:41.000' 230 2. 4
ABC. '2019-01-02 17:03:42.000' 231 1. 4
ABC. '2019-01-02 17:03:48.000' 232 2. 7
ABC. '2019-01-02 17:04:41.000' 230 2. 9
ABB. '2019-01-02 17:04:41.000' 235 5. 4
ABB. '2019-01-02 17:04:45.000' 236 2. 0
I need to generate an plot of an histogram of "number of different codes per <Batch_ID, minute> So in this case some entries will be:
<ABC, 2019-01-02 17:03> : 3
<ABC, 2019-01-02 17:04> : 1
<ABB, 2019-01-02 17:04> : 2
How can it be done? Thanks!
CodePudding user response:
Try this using pd.Grouper
on a datetime dtype column:
df = pd.read_clipboard(sep='\s\s ')
df['DateTime'] = pd.to_datetime(df['DateTime'].str.strip("'"))
df.groupby(['Batch_ID', pd.Grouper(key='DateTime', freq='T')])['Code'].count().rename('Count').reset_index()
Output:
Batch_ID DateTime Count
0 ABB. 2019-01-02 17:04:00 2
1 ABC. 2019-01-02 17:03:00 3
2 ABC. 2019-01-02 17:04:00 1