How to check which two variables are more correlated to variable "Price"?

Time:01-06

I have a DataFrame with 24 columns. I have to show the correlation among all the variables graphically, and I have to find the two variables most correlated with the variable "Price". I found a way to show this DataFrame as a graph:

import pandas as pd 
import seaborn as sn 
import matplotlib.pyplot as plt

# Create the correlation:

df4 = df_csv.corr()

# Show the graph:

sn.heatmap(df4, annot=True) 
plt.show()

But the graph is so busy with values that I cannot tell at a glance which two variables are most correlated with "Price". How can I filter the result so that the two variables most correlated with "Price" are easier to find?

CodePudding user response:

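I think you are almost there. You already have all the values in your df4. Select the column of interest, sort the values and take the first three rows. The top row is the column itself (its correlation with itself is 1), so the next two rows are the most correlated variables: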

df4["col_of_interest"].sort_values(key=abs, ascending=False).head(3)

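As Mustafa Aydin mentioned in the comments, this approach assumes that the strength of a correlation does not depend on its sign, so a correlation of -0.9 ranks above one of 0.5. If you only care about positive correlations, remove the key=abs part. Note that the key argument of sort_values requires pandas 1.1 or later.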
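For illustration, here is a minimal sketch of the same idea that drops "Price" from its own ranking so that only the two other variables come back. The data is synthetic and purely hypothetical (the columns "Area", "Age" and "Noise" are made up as a stand-in for the real df_csv), and nlargest is swapped in for the sort_values/head combination above:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_csv: synthetic columns, not the asker's data.
rng = np.random.default_rng(0)
n = 500
area = rng.normal(100, 20, n)
age = rng.normal(30, 10, n)
df_csv = pd.DataFrame({
    "Area": area,
    "Age": age,
    "Noise": rng.normal(0, 1, n),
    # Price rises with Area and falls with Age, plus a little random noise.
    "Price": 3.0 * area - 4.0 * age + rng.normal(0, 5, n),
})

# Correlation matrix over the numeric columns only.
df4 = df_csv.select_dtypes("number").corr()

# Correlations with "Price", excluding Price's correlation with itself.
price_corr = df4["Price"].drop("Price")

# Rank by absolute value, then look up the signed values for the top two.
top2 = price_corr.abs().nlargest(2)
top2_signed = price_corr.loc[top2.index]

print(top2_signed)
```

Ranking on the absolute value but printing the signed values keeps the direction of each relationship visible. To rank by signed correlation instead, call price_corr.nlargest(2) directly.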
