How to generate all dates in a month based on timestamp column using pandas dataframe

Time:08-29

I am using a pandas DataFrame with the columns Id, timestamp, and Value. Grouping by the Id column, and based on the timestamp column, I want to generate every date in that timestamp's month in a new column.

If the month has 30 days, 30 rows should be created in the new column, in dd-mm-yyyy format.

If the month has 31 days, 31 rows should be created in the new column, in dd-mm-yyyy format.

If the month has 28 days, 28 rows should be created in the new column, in dd-mm-yyyy format.

Input Dataframe   
        Id       timestamp      Value
       1002    18-01-2021   45
       1003    08-03-2021   14
       1004    03-07-2021   50
       1005    18-06-2021   23
       1005    12-09-2021   108

Output Dataframe

       Id       timestamp      Value   ext_dates
       1002    18-01-2021   45    01-01-2021
                                  02-01-2021
                                  03-01-2021
                                  04-01-2021
                                  05-01-2021
                                      .
                                      .
                                  31-01-2021
                                    
       1003    08-03-2021   14    01-03-2021
                                  02-03-2021
                                  03-03-2021
                                  04-03-2021
                                  05-03-2021
                                      .
                                      .
                                  31-03-2021

       1004    03-07-2021   50    01-07-2021
                                  02-07-2021
                                  03-07-2021
                                  04-07-2021
                                  05-07-2021
                                      .
                                      .
                                  31-07-2021

       1005    18-06-2021   23    01-06-2021
                                  02-06-2021
                                  03-06-2021
                                  04-06-2021
                                  05-06-2021
                                      .
                                      .
                                  30-06-2021

       1005    12-09-2021   108   01-09-2021
                                  02-09-2021
                                  03-09-2021
                                  04-09-2021
                                  05-09-2021
                                      .
                                      .
                                  30-09-2021

I have tried this

date_format = [datetime.strptime(i, '%d-%m-%Y') for i in df['timestamp']]
num_days = [calendar.monthrange(i.year,i.month,i.date)[2] for i in date_format]
num_days
df['ext_dates']= df.groupby(pd.PeriodIndex(data= num_days, freq='D'))

import pandas as pd

#month = '2018-08-01'
df_a['Timestamp'] = pd.to_datetime(df_a['Timestamp'])
month =df_a['Timestamp']
month
df = pd.DataFrame({
    'all_dates': pd.date_range(
        start = pd.Timestamp(month),                        
        end = pd.Timestamp(month) + pd.offsets.MonthEnd(0),  # <-- 2018-08-31 with MonthEnd
        freq = 'D'
    )
})


error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-38-98f03ab5c9bb> in <module>
      7 df = pd.DataFrame({
      8     'all_dates': pd.date_range(
----> 9         start = pd.Timestamp(month),
     10         end = pd.Timestamp(month) + pd.offsets.MonthEnd(0),  # <-- 2018-08-31 with MonthEnd
     11         freq = 'D'

pandas/_libs/tslibs/timestamps.pyx in pandas._libs.tslibs.timestamps.Timestamp.__new__()

pandas/_libs/tslibs/conversion.pyx in pandas._libs.tslibs.conversion.convert_to_tsobject()

TypeError: Cannot convert input [0   2022-01-01
1   2022-02-01
2   2022-03-01
Name: YearMonth, dtype: datetime64[ns]] of type <class 'pandas.core.series.Series'> to Timestamp
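
The traceback says that pd.Timestamp received a whole Series, while it only converts a single value. A minimal sketch of the same date_range call applied to one scalar date follows; the name one_date and its value are hypothetical stand-ins for a single element of the column, not something from the code above.

```python
import pandas as pd

# Hypothetical scalar standing in for one element of the datetime column.
one_date = pd.Timestamp("2022-02-01")

# pd.Timestamp works on scalars, so the range is built for one value at a time.
all_dates = pd.date_range(
    start=one_date.replace(day=1),          # first day of that month
    end=one_date + pd.offsets.MonthEnd(0),  # last day of that month
    freq="D",
)
```

To cover every row, the same call would have to be repeated per element, for example through Series.apply.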

CodePudding user response:

import pandas as pd
import calendar, datetime

df = pd.DataFrame({'Id':[1002,1003], 'timestamp':['18-01-2021','08-03-2021'],'Value':[45,14]})
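# The strings are day-first, so pass the format explicitly to avoid month/day confusion
df['timestamp'] = pd.to_datetime(df['timestamp'], format='%d-%m-%Y')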

def givedays(f):
    """A function to return all the days in the month of a given date"""
    year = f['timestamp'].year
    month = f['timestamp'].month
    num_days = calendar.monthrange(year, month)[1] #https://stackoverflow.com/questions/21231789/how-to-get-all-days-in-current-month
    days = [datetime.date(year, month, day) for day in range(1, num_days + 1)]
    return days

df['days'] = df.apply(givedays, axis=1)
df = df.explode(column='days')
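
The question asks for the new column in dd-mm-yyyy format, which the code above does not produce (it yields datetime.date objects). The following is one possible variant using pd.date_range instead of the calendar module, offered as a sketch rather than the only way to do it. The helper name month_dates, the result name out, and the sample rows are assumptions for illustration; the column name ext_dates follows the question's attempt.

```python
import pandas as pd

df = pd.DataFrame({
    "Id": [1002, 1003, 1005],
    "timestamp": ["18-01-2021", "08-03-2021", "18-06-2021"],
    "Value": [45, 14, 23],
})

# Parse the day-first strings explicitly; the original string column is left as-is.
ts = pd.to_datetime(df["timestamp"], format="%d-%m-%Y")

def month_dates(t):
    """Return every date in t's month as dd-mm-yyyy strings (hypothetical helper)."""
    first = t.replace(day=1)
    days = pd.date_range(first, periods=t.days_in_month, freq="D")
    return days.strftime("%d-%m-%Y").tolist()

# One list of dates per row, then one output row per date.
df["ext_dates"] = ts.apply(month_dates)
out = df.explode("ext_dates", ignore_index=True)
```

Unlike the desired output shown in the question, explode repeats Id, timestamp, and Value on every generated row instead of leaving them blank.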