I have a dataframe in pandas. I want to normalize my Close column by dividing every row by the price on the first row. This is my attempt:
import pandas as pd
import yfinance as yf

bova = yf.download(
    'BOVA11.SA',
    start='2014-01-01',
    end='2021-12-31',
    progress=False,
)
bova
output:
                  Open        High         Low       Close   Adj Close   Volume
Date
2014-01-02   50.220001   50.270000   49.060001   49.080002   49.080002  1001210
2014-01-03   49.099998   49.470001   49.049999   49.259998   49.259998  1227270
2014-01-06   49.259998   49.990002   49.200001   49.840000   49.840000   702060
2014-01-07   49.549999   50.290001   49.230000   49.230000   49.230000  1304100
2014-01-08   49.540001   49.590000   49.209999   49.279999   49.279999   951950
...                ...         ...         ...         ...         ...      ...
2021-12-23  101.599998  101.599998  100.709999  100.849998  100.849998  5047637
2021-12-27  101.400002  101.800003  100.949997  101.599998  101.599998  5274352
2021-12-28  101.830002  101.830002  100.589996  101.059998  101.059998  7167734
2021-12-29  101.239998  101.349998   99.940002  100.250000  100.250000  7343838
2021-12-30  100.809998  101.449997  100.290001  100.800003  100.800003  3404622
CodePudding user response:
So something like this?
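for c in bova.columns:
    if c != 'Date':  # only relevant if Date is a regular column rather than the index
        bova[c] = bova[c] / bova[c].iloc[0]
This iterates over every column of the dataframe except Date and divides each one by its own value in the first row. In the output shown in the question, Date is the index rather than a column, so the check only matters if you have reset the index.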
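The same per-column division can also be written without a loop. This is a minimal sketch, assuming a small hand-built frame named `prices` (a hypothetical stand-in for the downloaded data, filled with the first three rows from the question) whose columns are all numeric:

```python
import pandas as pd

# Hypothetical stand-in for the downloaded data: first three rows from the question.
prices = pd.DataFrame(
    {
        "Close": [49.080002, 49.259998, 49.840000],
        "Volume": [1001210, 1227270, 702060],
    },
    index=pd.to_datetime(["2014-01-02", "2014-01-03", "2014-01-06"]),
)
prices.index.name = "Date"

# prices.iloc[0] is a Series labelled by column name. Dividing a DataFrame by
# such a Series aligns on the column labels, so every column is divided by
# its own first-row value.
normalized = prices / prices.iloc[0]
print(normalized)
```

Note that this rescales Volume as well, just like the loop does. If you only want the price columns, select them first.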
CodePudding user response:
Use:
bova['Norm Close'] = bova['Close'] / bova['Close'].iloc[0]
print(bova[['Close', 'Norm Close']])
# Output
                 Close  Norm Close
Date
2014-01-02   49.080002    1.000000
2014-01-03   49.259998    1.003667
2014-01-06   49.840000    1.015485
2014-01-07   49.230000    1.003056
2014-01-08   49.279999    1.004075
...                ...         ...
2021-12-23  100.849998    2.054808
2021-12-27  101.599998    2.070090
2021-12-28  101.059998    2.059087
2021-12-29  100.250000    2.042583
2021-12-30  100.800003    2.053790

[1986 rows x 2 columns]
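If you want to try this without downloading anything, here is a small self-contained sketch. It assumes a hand-built frame named `sample` (hypothetical, using the first three Close values from the question) and a hypothetical helper `normalize_to_first`. The first row is selected with `.iloc[0]`, which is explicitly positional. A plain `[0]` on a Series with a date index relies on an integer-as-position fallback that recent pandas versions deprecate.

```python
import pandas as pd

def normalize_to_first(series: pd.Series) -> pd.Series:
    """Divide every value by the first value, selected by position."""
    return series / series.iloc[0]

# Hypothetical sample: first three Close values from the question.
sample = pd.DataFrame(
    {"Close": [49.080002, 49.259998, 49.840000]},
    index=pd.to_datetime(["2014-01-02", "2014-01-03", "2014-01-06"]),
)
sample.index.name = "Date"

sample["Norm Close"] = normalize_to_first(sample["Close"])
print(sample)
```

The first row of Norm Close is 1.0 by construction, and every later row reads as a multiple of the starting price.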