I want to make a new dataframe from changes from old dataframe

Time:03-30

I have a DataFrame and I want to make a new DataFrame containing features derived from the old one.

Here is some dummy code:

import pandas as pd
import numpy as np

df_msft = [['2020-1-1', 10, 11], ['2020-1-2', 15, 20], ['2020-1-3', 14, 12]]
df1 = pd.DataFrame(df_msft , columns = ['datetime', 'price_open', 'price_close'])

#Making a 'features' dataframe which will contain the features of the stocks in question
features = pd.DataFrame(index=df1.datetime).sort_index()

features['daily_change'] = df1.price_close/df1.price_open-1 # daily return
features['pct_change_on_day'] = df1.price_open/df1.price_close.shift(1)-1

When I do this, my 'features' DataFrame is filled with NaN values. Does anyone know why?

CodePudding user response:

Use:

features.loc[:, 'daily_change'] = (df1.price_close/df1.price_open - 1).to_numpy() # assign by position, ignoring df1's index

CodePudding user response:

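Your features and df1 DataFrames have different indexes: features is indexed by the datetime strings, while df1 still has the default integer index (0, 1, 2). When you assign a Series as a column, pandas aligns it on index labels; none of the labels match here, so your values are discarded and every row ends up NaN.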
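
To see the mismatch directly, here is a minimal sketch that reuses the question's data and compares the two indexes. The names `daily` and `shared` are only illustrative and do not appear in the question.

```python
import pandas as pd

df_msft = [['2020-1-1', 10, 11], ['2020-1-2', 15, 20], ['2020-1-3', 14, 12]]
df1 = pd.DataFrame(df_msft, columns=['datetime', 'price_open', 'price_close'])
features = pd.DataFrame(index=df1.datetime).sort_index()

# A Series computed from df1 inherits df1's default integer labels.
daily = df1.price_close / df1.price_open - 1
print(daily.index.tolist())     # integer labels
print(features.index.tolist())  # date-string labels

# Check whether any label of features also exists in daily's index.
shared = features.index.isin(daily.index)
print(shared.any())

# Assignment aligns on labels, so with no shared labels every row is NaN.
features['daily_change'] = daily
print(features['daily_change'].isna().all())
```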

You can overcome this by assigning the numpy array:

features['daily_change'] = (df1.price_close/df1.price_open-1).values # daily return
features['pct_change_on_day'] = (df1.price_open/df1.price_close.shift(1)-1).values

output:

          daily_change  pct_change_on_day
datetime                                 
2020-1-1      0.100000                NaN
2020-1-2      0.333333           0.363636
2020-1-3     -0.142857          -0.300000

Or, better, make "datetime" the index of df1, so that the computed Series carry the same labels as features:

df1 = df1.set_index('datetime')
features['daily_change'] = (df1.price_close/df1.price_open-1) # daily return
features['pct_change_on_day'] = (df1.price_open/df1.price_close.shift(1)-1)
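
Put together as one script, the index-based approach might look like the sketch below. It is a small reordering of the question's code, not the only way to do it: it assumes you are free to call `set_index` before creating `features`, and it builds `features` from `df1.index` instead of from the `datetime` column.

```python
import pandas as pd

df_msft = [['2020-1-1', 10, 11], ['2020-1-2', 15, 20], ['2020-1-3', 14, 12]]
df1 = pd.DataFrame(df_msft, columns=['datetime', 'price_open', 'price_close'])

# Index df1 by date so every Series derived from it carries the date labels.
df1 = df1.set_index('datetime').sort_index()

# features shares df1's index, so column assignment aligns label by label.
features = pd.DataFrame(index=df1.index)
features['daily_change'] = df1.price_close / df1.price_open - 1  # daily return
features['pct_change_on_day'] = df1.price_open / df1.price_close.shift(1) - 1

print(features)
```

Unlike `.values`, which copies by position and so silently depends on both frames having the same row order, this version matches rows by label. Note that the dates here are plain strings and sort lexicographically; for real data, converting the column with `pd.to_datetime` before setting the index is probably safer.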