Home > front end >  Combining multiple timestamps to a single day pandas
Combining multiple timestamps to a single day pandas

Time:10-02

I want to combine multiple datestamps (datetime64) to a single row representing one day. Then I want to sum up the amount in the last column getting the total sales per day.

Sample of the dataframe

In this case I want to have two lines, with the two days and the total sales. I have tried to solve my problem with the groupby operation, but it won't work.

CodePudding user response:

You could try to use resample

df_1d=df.resample('1d', on='timestamp').sum()

It will sum all data for all day or from another time

CodePudding user response:

The one-liner df.resample('1d', on='timestamp').sum() from Aeroxer Support is perfect, but it does not explain why your attempts with groupby failed.

In order to groupby to work, you would need a column with just the day in it. Then you could groupby by that day column.

Below is the example code. I add the extra column with just the day in it at In [4] and then df.groupby('day').sum() is what you are looking for.

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({
   ...:     'timestamp': map(pd.Timestamp, ['2022-09-30 11:21', '2022-09-30 20:55', '2022-10-01 10:35', '2022-10-01 22:42']),
   ...:     'sales': [99.90, 10.20, 5.99, 21.00]
   ...: })

In [3]: df
Out[3]: 
            timestamp  sales
0 2022-09-30 11:21:00  99.90
1 2022-09-30 20:55:00  10.20
2 2022-10-01 10:35:00   5.99
3 2022-10-01 22:42:00  21.00

In [4]: df['day'] = df.timestamp.apply(pd.Timestamp.date)

In [5]: df
Out[5]: 
            timestamp  sales         day
0 2022-09-30 11:21:00  99.90  2022-09-30
1 2022-09-30 20:55:00  10.20  2022-09-30
2 2022-10-01 10:35:00   5.99  2022-10-01
3 2022-10-01 22:42:00  21.00  2022-10-01

In [6]: df.groupby('day').sum()
Out[6]: 
             sales
day               
2022-09-30  110.10
2022-10-01   26.99
  • Related