Home > Net >  Returns per day and per company calculation in dataframe
Returns per day and per company calculation in dataframe

Time:10-16

I have a dataframe looking like this:

Index       date        company_tic      price
0        2020-01-01         xyz          20.89
1        2021-02-24         zyx          18.15
2        2020-01-02         xyz          20.14
3        2021-03-05         abc          28.19
4        2021-02-28         zyx          14.99
5        2021-03-06         abc          15.78

For each company (company_tic), I want to calculate the daily returns, with the following formula: ((price today - price 1 day prior) / price 1 day prior).

I would like the returns to appear in a new column in the existing dataframe. If data is not available on the day prior, show np.nan.

Anyone knows how to solve this?

CodePudding user response:

This should work:

import numpy as np
import pandas as pd


df = pd.DataFrame.from_dict(
    [
        {'date': '2020-01-01', 'company_tic': 'xyz', 'price': 20.89},
        {'date': '2021-02-24', 'company_tic': 'zyx', 'price': 18.15},
        {'date': '2020-01-02', 'company_tic': 'xyz', 'price': 20.14},
        {'date': '2021-03-05', 'company_tic': 'abc', 'price': 28.19},
        {'date': '2021-02-28', 'company_tic': 'zyx', 'price': 14.99},
        {'date': '2021-03-06', 'company_tic': 'abc', 'price': 15.78}
    ]
)
df['date'] = pd.to_datetime(df['date']).dt.date

for i, row in df.iterrows():
    prior_date = row.date - datetime.timedelta(days=1)
    price_prior = df[(df.date==prior_date) & (df.company_tic == row.company_tic)].price
    price_prior = float(price_prior) if not price_prior.empty else np.nan
    df.loc[i, 'daily_return'] = (row.price - price_prior) / price_prior

print(df)
         date company_tic  price  daily_return
0  2020-01-01         xyz  20.89           NaN
1  2021-02-24         zyx  18.15           NaN
2  2020-01-02         xyz  20.14     -0.035902
3  2021-03-05         abc  28.19           NaN
4  2021-02-28         zyx  14.99           NaN
5  2021-03-06         abc  15.78     -0.440227

CodePudding user response:

You can use a groupby.shift with a DatetimeIndex to access the prior day's price. Then perform the computation:

# ensure datetime
df['date'] = pd.to_datetime(df['date'])

prev = (df[['date', 'company_tic']]
        .merge(
 df.set_index('date')
   .groupby('company_tic')
   ['price']
   .shift(1, 'D').reset_index(),
    how='left'
 )['price']
)

df['return'] = df['price'].sub(prev).div(prev)

Output:

        date company_tic  price    return
0 2020-01-01         xyz  20.89       NaN
1 2021-02-24         zyx  18.15       NaN
2 2020-01-02         xyz  20.14 -0.035902
3 2021-03-05         abc  28.19       NaN
4 2021-02-28         zyx  14.99       NaN
5 2021-03-06         abc  15.78 -0.440227
  • Related