I want to modify Monthly_idxs so that it outputs the monthly boundaries starting from the first minute of each month (-01 00:00:00+00:00) instead of the current output. I also want to include the month of the initial index, which is October, but the current Monthly_idxs starts from November. How can I get the Expected Output below?
import pandas as pd
# Creates a 1-minute date range between date_range(a, b)
l = (pd.DataFrame(columns=['NULL'],
                  index=pd.date_range('2015-10-08T13:40:00Z', '2016-01-04T21:00:00Z',
                                      freq='1T'))
       .index.strftime('%Y-%m-%dT%H:%M:%SZ')
       .tolist()
)
# Month indexes
Monthly_idxs = pd.date_range(l[0], l[-1], freq='MS')
Output:
['2015-11-01 13:40:00+00:00', '2015-12-01 13:40:00+00:00',
'2016-01-01 13:40:00+00:00']
Expected Output:
['2015-10-01 00:00:00+00:00', '2015-11-01 00:00:00+00:00', '2015-12-01 00:00:00+00:00',
'2016-01-01 00:00:00+00:00']
CodePudding user response:
Your list conversion happens too early. You can use resample on your dataframe first, and then use format to get the list of strings from the resampled index:
df = pd.DataFrame(columns=['NULL'],
                  index=pd.date_range('2015-10-08T13:40:00Z', '2016-01-04T21:00:00Z',
                                      freq='1T'))
# Resample to month-start frequency; the bins begin at midnight on the 1st
Month_begin = df.resample('MS').asfreq()
Monthly_idxs = Month_begin.index.format()
print(Monthly_idxs)
Output:
['2015-10-01 00:00:00+00:00', '2015-11-01 00:00:00+00:00', '2015-12-01 00:00:00+00:00', '2016-01-01 00:00:00+00:00']
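For comparison, the same boundaries can probably also be built without resample, by snapping the first timestamp back to the start of its month before calling date_range. This is only a sketch under assumptions: the names idx, start and monthly are illustrative and not from the question, and 'min' is used as the one-minute frequency alias in place of '1T'.

```python
import pandas as pd

# One-minute index over the same span as in the question
idx = pd.date_range('2015-10-08T13:40:00Z', '2016-01-04T21:00:00Z', freq='min')

# Snap the first timestamp to midnight on the first day of its month.
# normalize() zeroes the time of day; replace(day=1) moves to the 1st.
start = idx[0].normalize().replace(day=1)

# With freq='MS' the generated dates keep the start's time of day,
# which is now 00:00:00, and October is included because start is on the 1st.
monthly = pd.date_range(start, idx[-1], freq='MS')

# isoformat(sep=' ') renders the UTC offset as +00:00
Monthly_idxs = [ts.isoformat(sep=' ') for ts in monthly]
print(Monthly_idxs)
```

This keeps the index as timestamps until the very last step, which is the same idea as the answer above: convert to strings only once the monthly boundaries have been computed.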
CodePudding user response:
We can build Monthly_idxs using round and DateOffset to get the expected result:
from pandas.tseries.offsets import DateOffset
Monthly_idxs = pd.date_range(
    pd.Timestamp(min(l)).round('1d') - DateOffset(months=1),
    pd.Timestamp(max(l)).round('1d'),
    freq='MS',
).strftime("%Y-%m-%d %H:%M:%S%z").tolist()
Output:
['2015-10-01 00:00:00+0000',
'2015-11-01 00:00:00+0000',
'2015-12-01 00:00:00+0000',
'2016-01-01 00:00:00+0000']
Thanks to @MrFuppes for the DateOffset idea.
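To see why this answer subtracts one month, it can help to print the intermediate values. The sketch below uses hypothetical names (first, rounded, shifted, month_start) that are not in the answer, writes the daily frequency as 'D', and adds MonthBegin().rollback purely as an assumed alternative, not something the answerer proposed.

```python
import pandas as pd
from pandas.tseries.offsets import DateOffset, MonthBegin

first = pd.Timestamp('2015-10-08T13:40:00Z')

# round('D') goes to the nearest midnight, so 13:40 rounds up to the 9th
rounded = first.round('D')
print(rounded)      # 2015-10-09 00:00:00+00:00

# Stepping back one month puts the start before October 1st,
# so freq='MS' can pick up 2015-10-01 as its first value
shifted = rounded - DateOffset(months=1)
print(shifted)      # 2015-09-09 00:00:00+00:00

# Alternative (an assumption, not from the answer): floor to midnight,
# then roll back to the month start if not already on it
month_start = MonthBegin().rollback(first.floor('D'))
print(month_start)  # 2015-10-01 00:00:00+00:00
```

One thing to watch with the round-based version: if the first timestamp falls on the 1st of a month before noon, round leaves it on the 1st, and subtracting a month would then add the previous month to the result. The floor and rollback variant does not depend on the time of day.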