Home > OS >  How do I change the name of a resampled column?
How do I change the name of a resampled column?

Time:03-11

I have a dataframe with the price fluctuations of the Nasdaq stock index every minute. In trading it is important to take into account data on different time units (to know the short term, medium and long term trends...)

So I used the resample() method of Pandas to get a dataframe with the price in 5 minutes in addition to the original 1 minute:

df1m = pd.DataFrame({
'Time' : ['2022-01-11 09:30:00', '2022-01-11 09:31:00', '2022-01-11 09:32:00', '2022-01-11 09:33:00', '2022-01-11 09:34:00', '2022-01-11 09:35:00', '2022-01-11 09:36:00' , '2022-01-11 09:37:00' , '2022-01-11 09:38:00' ,
       '2022-01-11 09:39:00', '2022-01-11 09:40:00'],  
'Price' : [1,2,3,4,5,6,7,8,9,10,11]})
df1m['Time'] = pd.to_datetime(df1m['Time'])
df1m.set_index(['Time'], inplace =True)

df5m = df1m.resample('5min').first() 

I renamed the column names to 5min :

df5m.rename(columns={'Price' : 'Price5'})

Unfortunately the change of column names is no longer taken into account when the two dataframes (1 and 5 min) are put together:

df_1m_5m = pd.concat([df1m, df5m], axis=1)

How to rename definitively the columns created for the 5min data and avoid having twice the same column name for different data?

CodePudding user response:

You can use:

df5m = df1m.resample('5min').first().add_suffix('5')
df_1m_5m = pd.concat([df1m, df5m], axis=1)

Output:

>>> df_1m_5m
                     Price  Price5
Time                              
2022-01-11 09:30:00      1     1.0
2022-01-11 09:31:00      2     NaN
2022-01-11 09:32:00      3     NaN
2022-01-11 09:33:00      4     NaN
2022-01-11 09:34:00      5     NaN
2022-01-11 09:35:00      6     6.0
2022-01-11 09:36:00      7     NaN
2022-01-11 09:37:00      8     NaN
2022-01-11 09:38:00      9     NaN
2022-01-11 09:39:00     10     NaN
2022-01-11 09:40:00     11    11.0

You forgot to reassign the result to your dataframe:

df5m = df5m.rename(columns={'Price' : 'Price5'})

# OR

df5m.rename(columns={'Price' : 'Price5'}, inplace=True)

Output:

>>> df5m
                     Price5
Time                       
2022-01-11 09:30:00       1
2022-01-11 09:35:00       6
2022-01-11 09:40:00      11

CodePudding user response:

Believe your issue is you are missing option inplace=true in your rename. By default it's false, so it generates a copy of the DataFrame rather than editing your existing DataFrame. Setting it to true will edit your existing DataFrame df5m

df5m.rename(columns={'Price' : 'Price5'},inplace=True)

Output of df_1m_5m:

                     Price  Price5
Time
2022-01-11 09:30:00      1     1.0
2022-01-11 09:31:00      2     NaN
2022-01-11 09:32:00      3     NaN
2022-01-11 09:33:00      4     NaN
2022-01-11 09:34:00      5     NaN
2022-01-11 09:35:00      6     6.0
2022-01-11 09:36:00      7     NaN
2022-01-11 09:37:00      8     NaN
2022-01-11 09:38:00      9     NaN
2022-01-11 09:39:00     10     NaN
2022-01-11 09:40:00     11    11.0
  • Related