Home > OS >  How to change year and month in a date which is in datetime format?
How to change year and month in a date which is in datetime format?

Time:04-16

I have a Python dataframe (8000 rows) with a datetime format column, which has dates like YYYY-MM-DD. I am looking to change it from being a single date to multiple months, and years, with same day.

My Output:

0  data-1  2011-12-03
1  data-2  2011-12-03
2  data-3  2011-12-03
..
..
   data-4  2011-12-03
   data-5  2011-12-03
7999  data-6  2011-12-03

Expected output:

     val1        date
0  data-1  2009-01-03
..         2009-02-03
           2009-03-03
           ..
           ..
           2009-11-03
           2009-12-03
11  data-n 2010-01-03
..         2010-02-03
           2010-03-03
           ..
           ..
           2010-11-03
           2010-12-03
           2011-01-03
..         2011-02-03
           2011-03-03
           ..
           ..
           2011-11-03
7999  data-m  2011-12-03

I want it to spread over the 12 months and 5 years. I tried:

df.date[0:999] = pd.to_datetime(stallion_df.date[0:999])   pd.offsets.DateOffset(years=1)
df.date[0:999] = pd.to_datetime(stallion_df.date[0:999])   pd.offsets.DateOffset(months=3)
... 

for 8000 rows for year and month, which is clearly not optimal. Any help would be much appreciated. Thanks

CodePudding user response:

Is this your expected output? It's constructing the Cartesian product of dates ranging from '2018-01-03' to '2022-12-03' and val column. You have date-1 to date-m, so I substituted m with 100. Then you'll get 6000 rows.

m = 100
out = (pd.MultiIndex.from_product([[f'date-{i}' for i in range(1,m 1)], 
                            pd.date_range('2018-01-01','2022-12-01', freq='MS')   pd.DateOffset(days=2)])
       .to_frame(name=['val','date']).reset_index(drop=True))

Output:

           val       date
0       date-1 2018-01-03
1       date-1 2018-02-03
2       date-1 2018-03-03
3       date-1 2018-04-03
4       date-1 2018-05-03
...        ...        ...
5995  date-100 2022-08-03
5996  date-100 2022-09-03
5997  date-100 2022-10-03
5998  date-100 2022-11-03
5999  date-100 2022-12-03

[6000 rows x 2 columns]

CodePudding user response:

You may basically use date_range to create a series to be added:

dates = pd.Series(pd.date_range(start='1/3/2009', end='3/11/2011', freq='M'))

Just replace it with your desired start and end date. It will automatically increase months and years by keeping the last day of each month. If you want to set the day to a specific one:

dates.apply(lambda x: x.replace(day=3))

This returns the same but with day 3 for all entries. If you also want a larger series with repeated days, you may use repeat as:

dates.repeat(10).reset_index(drop=True)

So this way you will have the same series but each date is repeated 10 times.

  • Related