Assuming I have the following df:
Company Apples Mangoes Oranges
Amazon 0.75 0.6 0.98
BellTM 0.23 0.75 0.14
Cadbury 0.4 0.44 0.86
and then another data frame called vendor:
Company Apples Mangoes Oranges
Deere 0.11 0.3 0.79
I want to find the row-wise correlation of each company with the company Deere in the vendor data frame. I want the resulting correlation coefficient added as a column called Corrcoef to the original data frame df:
Company Apples Mangoes Oranges Corrcoef
Amazon 0.75 0.6 0.98 0.77955981
BellTM 0.23 0.75 0.14 -0.37694478
Cadbury 0.4 0.44 0.86 0.98092707
When I attempt the following:
df.iloc[:,1:].corrwith(vendor.iloc[:,1:], axis=1)
I get a Series of NaN values.
I obtained the Corrcoef values shown above manually by saving each row as an array and using np.corrcoef(x1, y).
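For reference, the manual calculation described above might look roughly like the sketch below. The frames are rebuilt from the tables shown in the question, and the names y and manual are illustrative assumptions, not the original code.

```python
import numpy as np
import pandas as pd

# Frames reconstructed from the tables in the question.
df = pd.DataFrame({
    "Company": ["Amazon", "BellTM", "Cadbury"],
    "Apples": [0.75, 0.23, 0.4],
    "Mangoes": [0.6, 0.75, 0.44],
    "Oranges": [0.98, 0.14, 0.86],
})
vendor = pd.DataFrame({
    "Company": ["Deere"],
    "Apples": [0.11],
    "Mangoes": [0.3],
    "Oranges": [0.79],
})

# The single vendor row as a float array (skipping the Company column).
y = vendor.iloc[0, 1:].to_numpy(dtype=float)

# np.corrcoef returns a 2x2 matrix; entry [0, 1] is the Pearson
# correlation between the two input arrays.
manual = [
    np.corrcoef(row.to_numpy(dtype=float), y)[0, 1]
    for _, row in df.iloc[:, 1:].iterrows()
]
print(manual)
```

This loops over the rows one at a time, which is fine for three companies but is exactly what corrwith is meant to replace.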
CodePudding user response:
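You need to pass a Series to corrwith. When it is given a DataFrame, corrwith aligns on row labels as well as column labels, so rows of df that have no matching row label in vendor come out as NaN.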
You can use:
df.set_index('Company').corrwith(vendor.set_index('Company').loc['Deere'], axis=1)
output:
Company
Amazon 0.779560
BellTM -0.376945
Cadbury 0.980927
dtype: float64
With your code:
df.iloc[:, 1:].corrwith(vendor.iloc[0,1:].astype(float), axis=1)
output:
0 0.779560
1 -0.376945
2 0.980927
dtype: float64
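To get the result into df as the Corrcoef column the question asks for, one possible approach is to map the Company-indexed Series back onto the Company column. This is a minimal sketch: the frames are rebuilt from the tables in the question, and the names deere and corr are assumptions made for illustration.

```python
import pandas as pd

# Frames reconstructed from the tables in the question.
df = pd.DataFrame({
    "Company": ["Amazon", "BellTM", "Cadbury"],
    "Apples": [0.75, 0.23, 0.4],
    "Mangoes": [0.6, 0.75, 0.44],
    "Oranges": [0.98, 0.14, 0.86],
})
vendor = pd.DataFrame({
    "Company": ["Deere"],
    "Apples": [0.11],
    "Mangoes": [0.3],
    "Oranges": [0.79],
})

# Select the Deere row as a Series indexed by fruit name.
deere = vendor.set_index("Company").loc["Deere"]

# Correlate every row of df with that Series; the result is indexed by Company.
corr = df.set_index("Company").corrwith(deere, axis=1)

# Look up each company's coefficient and store it as a new column.
df["Corrcoef"] = df["Company"].map(corr)
print(df)
```

Mapping on Company avoids relying on the two frames sharing the same row order. If df keeps its default integer index, assigning the result of the iloc-based version directly (df["Corrcoef"] = ...) should also work, since that Series carries the same integer index.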