I have never used multiIndex pandas dataframes, and the Python API for CalcBench, a financial program I use, returned one when I typed the following code:
dataItems = ["ROE", "StockholdersEquity", "SharesOutstandingEndOfPeriod", "EndOfPeriodStockPrice"]
data = cb.normalized_data(tickers, dataItems, start_year=2021, end_year=2021, period_type='annual')
If I try to explore my dataframe by typing data.info()
, I get:
<class 'pandas.core.frame.DataFrame'>
PeriodIndex: 1 entries, 2021 to 2021
Freq: A-DEC
Columns: 1800 entries, ('EndOfPeriodStockPrice', 'A') to ('StockholdersEquity', 'ZTS')
dtypes: float64(1800)
Typing data.columns()
gives me:
MultiIndex([('EndOfPeriodStockPrice', 'A'),
('EndOfPeriodStockPrice', 'AAL'),
('EndOfPeriodStockPrice', 'AAP'),
('EndOfPeriodStockPrice', 'AAPL'),
('EndOfPeriodStockPrice', 'ABBV'),
('EndOfPeriodStockPrice', 'ABC'),
('EndOfPeriodStockPrice', 'ABT'),
('EndOfPeriodStockPrice', 'ACN'),
('EndOfPeriodStockPrice', 'ADBE'),
('EndOfPeriodStockPrice', 'ADI'),
...
( 'StockholdersEquity', 'WYNN'),
( 'StockholdersEquity', 'XEL'),
( 'StockholdersEquity', 'XOM'),
( 'StockholdersEquity', 'XRAY'),
( 'StockholdersEquity', 'XYL'),
( 'StockholdersEquity', 'YUM'),
( 'StockholdersEquity', 'ZBH'),
( 'StockholdersEquity', 'ZBRA'),
( 'StockholdersEquity', 'ZION'),
( 'StockholdersEquity', 'ZTS')],
names=['metric', 'ticker'], length=1800)
I would like to create a new metric MarketCapAtEndOfPeriod
in this dataframe for each firm in the year 2021 by multiplying the corresponding entries for EndofPeriodStockPrice
and SharesOutstandingEndOfPeriod
.
My efforts to do so have gone nowhere, and searching StackOverflow has not helped me solve my problem. Loosely speaking, I'm looking to write something like:
data["MarketCapAtEndOfPeriod"] = data["EndofPeriodStockPrice"] * data["SharesOutstandingEndOfPeriod"]
CodePudding user response:
market_cap = data.EndOfPeriodStockPrice * data.SharesOutstandingEndOfPeriod
market_cap_df = pd.concat(
{"MarketCap": market_cap},
names=["metric"],
axis=1,
)
data = data.join(market_cap_df)
CodePudding user response:
multiplying the corresponding entries for
EndofPeriodStockPrice
andSharesOutstandingEndOfPeriod
Use DataFrame.xs
to access MultiIndex cross-sections by specifying the label and an optional axis/level.
- In your case, the target labels are in the first level (default), so you can leave out
level=0
and just specifyaxis=1
for columns. - To assign the output as new columns, reconstruct the MultiIndex with
MultiIndex.from_product
before rejoining with the original dataframe.
key1 = 'EndofPeriodStockPrice'
key2 = 'SharesOutstandingEndOfPeriod'
new = 'MarketCapAtEndOfPeriod'
prod = data.xs(key1, axis=1) * data.xs(key2, axis=1)
prod.columns = pd.MultiIndex.from_product([[new], [key1, key2]])
# MultiIndex([('MarketCapAtEndOfPeriod', 'EndofPeriodStockPrice'),
# ('MarketCapAtEndOfPeriod', 'SharesOutstandingEndOfPeriod')])
df = df.join(prod)
If hypothetically you wanted to divide corresponding AAPL
and ABC
entries, those are in the second column level, so specify both axis=1
and level=1
:
data.xs('AAPL', axis=1, level=1) / data.xs('ABC', axis=1, level=1)