I have the following data, to which I would like to add a new column holding the current month-over-month percent change. The date is the index of my dataframe.
date close
1/26/1990 421.2999878
1/29/1990 418.1000061
1/30/1990 410.7000122
1/31/1990 415.7999878
2/23/1990 419.5
2/26/1990 421
2/27/1990 422.6000061
2/28/1990 425.7999878
3/26/1990 438.7999878
3/27/1990 439.5
3/28/1990 436.7000122
3/29/1990 435.3999939
3/30/1990 435.5
The simplest way I could think of to do this is to add a column containing the previous month-end date and, for convenience, the previous month-end 'close' value; from those I can calculate the current month-over-month change. So in the end, I would have a table that looks like this:
I was able to add the previous month end just fine, but I am now having problems looking up the previous month-end close based on the previous month-end date. In the code below, the first line works fine and adds the previous month-end date as a new column. The second does not: the idea is to use the prev_month_end date to look up the month-end close value and add that as a column.
df['prev_month_end'] = df.index + pd.offsets.BMonthEnd(-1)
df['prev_month_close'] = df[df.index == df['prev_month_end']]
Any help or suggestions on how to accomplish this would be appreciated.
CodePudding user response:
You can get prev_month_close as follows:
df.reset_index(inplace=True)
df = df[['date', 'close', 'prev_month_end']].merge(
    df[['date', 'close']].rename(columns={'close': 'prev_month_close', 'date': 'prev_month_end'}),
    how='left', on='prev_month_end')
OUTPUT
date close prev_month_end prev_month_close
0 1990-01-26 421.299988 1989-12-29 NaN
1 1990-01-29 418.100006 1989-12-29 NaN
2 1990-01-30 410.700012 1989-12-29 NaN
3 1990-01-31 415.799988 1989-12-29 NaN
4 1990-02-23 419.500000 1990-01-31 415.799988
5 1990-02-26 421.000000 1990-01-31 415.799988
6 1990-02-27 422.600006 1990-01-31 415.799988
7 1990-02-28 425.799988 1990-01-31 415.799988
8 1990-03-26 438.799988 1990-02-28 425.799988
9 1990-03-27 439.500000 1990-02-28 425.799988
10 1990-03-28 436.700012 1990-02-28 425.799988
11 1990-03-29 435.399994 1990-02-28 425.799988
12 1990-03-30 435.500000 1990-02-28 425.799988
Or, without using reset_index:
df = df[['close', 'prev_month_end']].merge(
    df[['close']].rename(columns={'close': 'prev_month_close'}),
    how='left', left_on='prev_month_end', right_index=True)
OUTPUT
close prev_month_end prev_month_close
date
1990-01-26 421.299988 1989-12-29 NaN
1990-01-29 418.100006 1989-12-29 NaN
1990-01-30 410.700012 1989-12-29 NaN
1990-01-31 415.799988 1989-12-29 NaN
1990-02-23 419.500000 1990-01-31 415.799988
1990-02-26 421.000000 1990-01-31 415.799988
1990-02-27 422.600006 1990-01-31 415.799988
1990-02-28 425.799988 1990-01-31 415.799988
1990-03-26 438.799988 1990-02-28 425.799988
1990-03-27 439.500000 1990-02-28 425.799988
1990-03-28 436.700012 1990-02-28 425.799988
1990-03-29 435.399994 1990-02-28 425.799988
1990-03-30 435.500000 1990-02-28 425.799988
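With prev_month_close in place, the percent change the question asks for is one more line. The sketch below is self-contained and uses a few rows from the question; note that it swaps the merge for a plain `Series.map` lookup, which should give the same column as long as the dates in the index are unique. It assumes the index has already been parsed to datetimes.

```python
import pandas as pd

# A few rows from the question, with the date strings parsed into a DatetimeIndex.
df = pd.DataFrame(
    {"close": [415.7999878, 419.5, 425.7999878, 435.5]},
    index=pd.to_datetime(["1/31/1990", "2/23/1990", "2/28/1990", "3/30/1990"]),
)
df.index.name = "date"

# Previous business month end for every row (same as the question's first line).
df["prev_month_end"] = df.index + pd.offsets.BMonthEnd(-1)

# Look up the close on that date; dates that are not in the index give NaN.
df["prev_month_close"] = df["prev_month_end"].map(df["close"])

# Month-over-month percent change against the previous month-end close.
df["mom_pct_change"] = (df["close"] / df["prev_month_close"] - 1) * 100

print(df)
```

One caveat with any lookup keyed on the calendar date: if the previous business month end is missing from the data (for example, a market holiday), prev_month_close comes back as NaN for that whole month.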
CodePudding user response:
We can convert the index to a period index, then group the dataframe by period and aggregate close using last, then shift the period index one month back and map it with the closing values, and finally calculate the percent change:
i = pd.to_datetime(df.index).to_period('M')
s = i.shift(-1).map(df.groupby(i)['close'].last())
df['mom_pct_change'] = df['close'].sub(s).div(s).mul(100)
close mom_pct_change
date
1/26/1990 421.299988 NaN
1/29/1990 418.100006 NaN
1/30/1990 410.700012 NaN
1/31/1990 415.799988 NaN
2/23/1990 419.500000 0.889854
2/26/1990 421.000000 1.250604
2/27/1990 422.600006 1.635406
2/28/1990 425.799988 2.405002
3/26/1990 438.799988 3.053077
3/27/1990 439.500000 3.217476
3/28/1990 436.700012 2.559893
3/29/1990 435.399994 2.254581
3/30/1990 435.500000 2.278068
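To make the intermediate steps easier to follow, here is the same idea unpacked into named pieces on a subset of the question's rows. The variable names (`month`, `month_end_close`) and the extra `prev_month_close` column are only for illustration, and the sketch assumes the rows are already sorted by date so that `last` picks the final trading day of each month.

```python
import pandas as pd

# A subset of the question's rows; the index holds the original date strings.
df = pd.DataFrame(
    {"close": [410.7000122, 415.7999878, 419.5, 425.7999878, 435.3999939, 435.5]},
    index=pd.Index(
        ["1/30/1990", "1/31/1990", "2/23/1990", "2/28/1990", "3/29/1990", "3/30/1990"],
        name="date",
    ),
)

# Monthly period of each row, e.g. 1990-02 for 2/23/1990.
month = pd.to_datetime(df.index).to_period("M")

# Last close seen in each month, indexed by period.
month_end_close = df.groupby(month)["close"].last()

# Shift each row's period back one month and look up that month's last close.
# Months with no data before them (January here) map to NaN.
df["prev_month_close"] = month.shift(-1).map(month_end_close)

df["mom_pct_change"] = (df["close"] / df["prev_month_close"] - 1) * 100

print(df)
```

Because the lookup is keyed on the month rather than on a specific calendar date, it uses whatever the last available close in the previous month is, even when the business month end itself is missing from the data.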