I am working with a pandas DataFrame whose columns are Date, Events and Count.
Date Events Count
1/1/2021 Event1 2
1/1/2021 Event2 1
2/1/2021 Event1 2
1/2/2021 Event1 1
2/1/2021 Event2 1
3/2/2021 Event3 2
3/2/2021 Event3 2
If an event repeats within the same month, I want to keep only one row for it and set the Count column to a default value of 1. The expected result is:
Date Events Count
1/1/2021 Event1 1
1/1/2021 Event2 1
1/2/2021 Event1 1
3/2/2021 Event3 1
CodePudding user response:
Use DataFrame.drop_duplicates on the month and Events columns, then set the Count column to 1:
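import pandas as pd
# Parse the dates as day/month/year
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
# Helper column holding the year-month period of each date
df['months'] = df['Date'].dt.to_period('M')
# Keep the first row per (month, event) pair and reset Count to 1
df1 = df.drop_duplicates(['months', 'Events']).assign(Count=1)
print(df1)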
        Date  Events  Count   months
0 2021-01-01  Event1      1  2021-01
1 2021-01-01  Event2      1  2021-01
3 2021-02-01  Event1      1  2021-02
5 2021-02-03  Event3      1  2021-02
# Drop the helper column once the duplicates are removed
df1 = df1.drop(columns='months')
print(df1)
        Date  Events  Count
0 2021-01-01  Event1      1
1 2021-01-01  Event2      1
3 2021-02-01  Event1      1
5 2021-02-03  Event3      1
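If you would rather not add and then drop a helper column, the same idea can be sketched with a boolean mask. This is only an illustration, not part of the answer above: it rebuilds the sample data from the question, assumes the dates are day/month/year, and uses the made-up names keys and result.

```python
import pandas as pd

# Sample data copied from the question; dates are assumed to be day/month/year.
df = pd.DataFrame({
    "Date": ["1/1/2021", "1/1/2021", "2/1/2021", "1/2/2021",
             "2/1/2021", "3/2/2021", "3/2/2021"],
    "Events": ["Event1", "Event2", "Event1", "Event1",
               "Event2", "Event3", "Event3"],
    "Count": [2, 1, 2, 1, 1, 2, 2],
})
df["Date"] = pd.to_datetime(df["Date"], dayfirst=True)

# Build the (month, event) keys in a separate frame so df gets no extra column.
keys = pd.DataFrame({
    "month": df["Date"].dt.to_period("M"),
    "Events": df["Events"],
})

# duplicated() flags every repeat after the first occurrence of a key pair,
# so negating it keeps the first row for each event in each month.
result = df[~keys.duplicated()].assign(Count=1)
print(result)
```

Like drop_duplicates, this keeps the first row of each (month, event) pair, so the original index labels survive. Call reset_index(drop=True) on the result if you want a fresh 0..n-1 index.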