Using Pandas data frames, how do I subtract to find the differences in columns '(x$1000)' and 'PRN AMT' between data frames based on 'CUSIP'(which acts as a unique id)? The dataset I provided is a sample, so the solution must be able to contend with a different order. I've tried reading documentation on dataframe.subtract() and don't understand how to apply it to my specific problem.
data frame 1:
CUSIP (x$1000) PRN AMT TICKER
594918104 2765345852 2114080582 MSFT
037833100 1891440177 3058252946 AAPL
02079K305 1721936077 132543866 GOOGL
023135106 1341784239 2573051329 AMZN
data frame 2:
CUSIP (x$1000) PRN AMT TICKER
594918104 3034828140 1612323669 MSFT
037833100 2463247977 2628732382 AAPL
02079K305 2096049986 93429916 GOOGL
023135106 1581124222 118724459 AMZN
Wanted output:
CUSIP (x$1000) PRN AMT TICKER
594918104 -269482288 501756913 MSFT
037833100 -571807800 429520564 AAPL
02079K305 -374113909 39113950 GOOGL
023135106 -239339983 2454326870 AMZN
Here is the code to recreate the dataframes:
import pandas as pd
dataset_1 = {'CUSIP': ['594918104', '037833100', '02079K305', '023135106'], '(x$1000)': [
2765345852, 1891440177, 1721936077, 1341784239], 'PRN AMT': [2114080582, 3058252946, 132543866, 2573051329], 'TICKER': ['MSFT', 'AAPL', 'GOOGL', 'AMZN']}
dataset_2 = {'CUSIP': ['594918104', '037833100', '02079K305', '023135106'], '(x$1000)': [
3034828140, 2463247977, 2096049986, 1581124222], 'PRN AMT': [1612323669, 2628732382, 93429916, 118724459], 'TICKER': ['MSFT', 'AAPL', 'GOOGL', 'AMZN']}
df_1 = pd.DataFrame(data=dataset_1)
df_2 = pd.DataFrame(data=dataset_2)
print(f'{df_1} and {df_2}')
CodePudding user response:
Create MultiIndex
by CUSIP,TICKER
and subtract by DataFrame.sub
, last DataFrame.reset_index
and change order of columns by DataFrame.reindex
:
df = (df_1.set_index(['CUSIP','TICKER'])
.sub(df_2.set_index(['CUSIP','TICKER']))
.reset_index()
.reindex(df_1.columns, axis=1))
print (df)
CUSIP (x$1000) PRN AMT TICKER
0 594918104 -269482288 501756913 MSFT
1 037833100 -571807800 429520564 AAPL
2 02079K305 -374113909 39113950 GOOGL
3 023135106 -239339983 2454326870 AMZN