I would like to apply a function to specific columns of a pandas DataFrame. Here is an illustration:
# import modules
from pandas_datareader import data as pdr
# set parameters
start = "2020-01-01"
end = "2021-01-01"
symbols = ["AAPL"]
# get the data
data = pdr.get_data_yahoo(symbols, start, end)
def mult(row):
    return row['Close']*2, row['Open']/3
data[['Close', 'Open']].apply(mult, axis = 1)
print(data.head())
The result:
Attributes Adj Close Close High Low Open Volume
Symbols AAPL AAPL AAPL AAPL AAPL AAPL
Date
2020-01-02 73.894333 75.087502 75.150002 73.797501 74.059998 135480400.0
2020-01-03 73.175926 74.357498 75.144997 74.125000 74.287498 146322800.0
2020-01-06 73.759003 74.949997 74.989998 73.187500 73.447502 118387200.0
2020-01-07 73.412109 74.597504 75.224998 74.370003 74.959999 108872000.0
2020-01-08 74.593048 75.797501 76.110001 74.290001 74.290001 132079200.0
Any thoughts as to why that doesn't work?
CodePudding user response:
Two things:
(i) apply returns a new object. You never assign that result back to the original DataFrame, so data is never updated.
(ii) If your function is no more complex than this, a vectorized operation is the better choice for simple arithmetic. Instead of calling the function, operate directly on the columns:
data['Close'] *= 2
data['Open'] /= 3
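Both points can be checked without downloading anything. The following is a minimal sketch, assuming a small made-up DataFrame with flat column names (the values are illustrative, not real quotes):

```python
import pandas as pd

# Hypothetical stand-in for the downloaded data (made-up values, flat columns).
df = pd.DataFrame({"Close": [10.0, 20.0], "Open": [3.0, 6.0]})

def mult(row):
    return row["Close"] * 2, row["Open"] / 3

# apply builds and returns a new object; df itself is left untouched.
result = df[["Close", "Open"]].apply(mult, axis=1)
unchanged = df["Close"].tolist() == [10.0, 20.0]

# Vectorized in-place update, as suggested above.
df["Close"] *= 2
df["Open"] /= 3
print(df)
```

Here result holds the computed values, while df only changes once the columns are reassigned.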
CodePudding user response:
I think the problem is that you are not assigning the return value of the mult function to any variable.
One way to achieve what you want is:
# import modules
from pandas_datareader import data as pdr
# set parameters
start = "2020-01-01"
end = "2021-01-01"
symbols = ["AAPL"]
# get the data
data = pdr.get_data_yahoo(symbols, start, end)
def mult(df):
    df['Close'] = 2 * df['Close']
    df['Open'] = df['Open'] / 3
    return df
mult(data)
print(data.head())
The result:
Attributes Adj Close Close High Low Open \
Symbols AAPL AAPL AAPL AAPL AAPL
Date
2020-01-02 73.894325 150.175003 75.150002 73.797501 24.686666
2020-01-03 73.175926 148.714996 75.144997 74.125000 24.762499
2020-01-06 73.759010 149.899994 74.989998 73.187500 24.482501
2020-01-07 73.412117 149.195007 75.224998 74.370003 24.986666
2020-01-08 74.593048 151.595001 76.110001 74.290001 24.763334
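If the row-wise function from the question has to stay, its result can be expanded into columns and assigned back. This is a minimal sketch, assuming a small made-up DataFrame with flat column names rather than the downloaded data:

```python
import pandas as pd

# Hypothetical sample data (made-up values, flat column names).
df = pd.DataFrame(
    {"Close": [10.0, 20.0], "Open": [3.0, 6.0], "High": [11.0, 21.0]}
)

def mult(row):
    return row["Close"] * 2, row["Open"] / 3

cols = ["Close", "Open"]
# result_type="expand" turns each returned tuple into columns of a new DataFrame.
scaled = df[cols].apply(mult, axis=1, result_type="expand")
# Name the expanded columns explicitly so the assignment below is unambiguous.
scaled.columns = cols
df[cols] = scaled
print(df)
```

For plain arithmetic like this the vectorized form is usually faster. The row-wise version is mainly useful when the function needs several values from the same row.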