Home > Mobile >  Create return dataframe from price dataframe
Create return dataframe from price dataframe

Time:11-18

I try to figure out an efficient way to create a DataFrame of returns (ReturnTable) based on prices (PriceTable). Is there a more efficient way than just iterating with a for loop over the columns?

Here I have a small example:

import pandas as pd
PriceTable = pd.DataFrame({
    'Dates':['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04'],
    'Stock1':[10, 12, 13.5, 11.5], 
    'Stock2':[5, 5.5, 'NaN', 'NaN'], 
    'Stock3':['NaN', 9, 9.5, 10.5], 
    'Stock4':[20, 20, 19.5, 15]}) 

I try to get a ReturnTable which is just the division of the price at time t and t-1.

ReturnTable = pd.DataFrame({
    'Dates':['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04'],
    'Stock1':[0, 1.2, 1.125, 0.852], 
    'Stock2':[0, 1.1, 'NaN', 'NaN'], 
    'Stock3':['NaN', 0, 1.056, 1.105], 
    'Stock4':[0, 1, 0.975, 0.769]})

Thanks a lot!

CodePudding user response:

You could use shift and a division:

NB. 'NaN' are strings, so you first need to convert to float('nan')

PriceTable = PriceTable.replace('NaN', float('nan'))
cols = PriceTable.select_dtypes('number').columns

ReturnTable = PriceTable.copy()

ReturnTable[cols] = PriceTable[cols]/PriceTable[cols].shift()

output:

        Dates    Stock1    Stock2    Stock3    Stock4
0  2021-01-01       NaN       NaN       NaN       NaN
1  2021-01-02    1.2000    1.1000       NaN    1.0000
2  2021-01-03    1.1250       NaN    1.0556    0.9750
3  2021-01-04    0.8519       NaN    1.1053    0.7692
  • Related