How can I get column name from correlation list?

Time:01-03


I want to get the names of all columns whose correlation is greater than 0.2 and less than 0.8. Is there a way to do this?

CodePudding user response:

Using the example from the pandas docs, we can compute the correlations, filter them with two conditions, take the index of the matches, and convert it to a list.

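import numpy as np
import pandas as pd

def histogram_intersection(a, b):
    # Custom correlation measure taken from the pandas docs example
    v = np.minimum(a, b).sum().round(decimals=1)
    return v

df = pd.DataFrame([(.2, .3), (.0, .6), (.6, .0), (.2, .1)],
                  columns=['dogs', 'cats'])

# Absolute correlation of every column with 'cats'
c = abs(df.corr(method=histogram_intersection)['cats'])

print(c)
# Keep only the names whose value lies strictly between 0.2 and 0.8
print(c[(c>.2) & (c<.8)].index.tolist())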

Output

dogs    0.3
cats    1.0
Name: cats, dtype: float64

['dogs']

CodePudding user response:

You can index the Series corList with conditions, and retrieve the names with .index:

corList[(corList > 0.2) & (corList < 0.8)].index

Or a possibly more readable version:

corList[corList.gt(0.2) & corList.lt(0.8)].index
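
For reference, here is a minimal self-contained sketch of the same filter. The DataFrame and its column names (target, strong, medium, weak, negative) are made up for illustration, and building corList from df.corr()["target"] is an assumption about how the original Series was created. It also shows the effect of taking the absolute value first, as the other answer does, so that negative correlations are kept.

```python
import pandas as pd

# Hypothetical data; the column names are for illustration only.
df = pd.DataFrame({
    "target":   [1, 2, 3, 4, 5],
    "strong":   [2, 4, 6, 8, 10],   # correlation with target: 1.0
    "medium":   [1, 3, 5, 2, 4],    # correlation with target: 0.5
    "weak":     [3, 1, 4, 1, 3],    # correlation with target: about 0
    "negative": [5, 3, 1, 4, 2],    # correlation with target: -0.5
})

# Assumed construction of corList: Pearson correlation of each column
# with 'target', with the trivial self-correlation dropped.
corList = df.corr()["target"].drop("target")

# Names whose correlation lies strictly between 0.2 and 0.8.
selected = corList[corList.gt(0.2) & corList.lt(0.8)].index.tolist()

# Same filter on the absolute value, so negative correlations count too.
absCor = corList.abs()
selected_abs = absCor[absCor.gt(0.2) & absCor.lt(0.8)].index.tolist()

print(selected)
print(selected_abs)
```

Calling .tolist() turns the resulting Index into a plain Python list; leave it off if an Index is enough.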