I have this function:
def highest_correlation(dataframe):
    corr_table = dataframe.corr().unstack()
    df_corrvalues = corr_table.sort_values(ascending=False)
    return df_corrvalues

correlation = highest_correlation(heart)
correlation
This is the output:
age age 1.000000
sex sex 1.000000
thall thall 1.000000
caa caa 1.000000
slp slp 1.000000
...
output oldpeak -0.429146
exng -0.435601
exng output -0.435601
slp oldpeak -0.576314
oldpeak slp -0.576314
Length: 196, dtype: float64
How can I return the highest correlation values that are lower than 1?
That is, I want to remove the 1s that appear at the top when I use sort_values(ascending=False).
CodePudding user response:
Here is an example using a MultiIndex Series from the pandas User Guide:
import pandas as pd
from numpy.random import default_rng
rng = default_rng()
arrays = [
["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
["one", "two", "one", "two", "one", "two", "one", "two"],
]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=["first", "second"])
s = pd.Series(rng.standard_normal(8), index=index)
print(s)
Filter for values less than one with a boolean mask:
print(s[s<1])
first second
bar one 1.602675
two -0.197277
baz one -0.746729
two 1.384208
foo one 1.587294
two -1.616769
qux one -0.872030
two -0.721226
dtype: float64
first second
bar two -0.197277
baz one -0.746729
foo two -1.616769
qux one -0.872030
two -0.721226
dtype: float64
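Applied to the correlation table from the question, the same boolean-mask idea might look like the sketch below. Instead of testing `< 1`, it drops the rows where both index levels name the same column, which avoids depending on floating-point equality and keeps any genuinely perfect correlation between two different columns. The `demo` DataFrame and its column names are hypothetical stand-ins for `heart`, which is not shown in the question.

```python
import numpy as np
import pandas as pd

def highest_correlation(dataframe):
    # Pairwise correlations as a Series indexed by (column, column)
    corr_table = dataframe.corr().unstack()
    # Drop self-correlations: rows whose two index levels are the same column
    first = corr_table.index.get_level_values(0)
    second = corr_table.index.get_level_values(1)
    corr_table = corr_table[first != second]
    return corr_table.sort_values(ascending=False)

# Hypothetical data standing in for the `heart` DataFrame
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    rng.standard_normal((50, 4)),
    columns=["age", "sex", "caa", "slp"],
)

result = highest_correlation(demo)
print(result.head())
```

Each pair still appears twice, once as (a, b) and once as (b, a), just as in the original output. If the simpler filter is enough for your data, `corr_table[corr_table < 1]` would also remove the diagonal, as long as the diagonal entries come out as exactly 1.0.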