I have a dataframe:
id 1 2 3 5
1 1 0.6 0.2 0.9
2 0.6 1 0.4 0.8
3 0.2 0.4 1 0.2
5 0.9 0.8 0.2 1
The columns are id, 1, 2, 3, 5. I want to extract pairs made of the id value and the label of another column wherever the value is higher than 0.7. So the desired result is:
id1 id2 value
1 5 0.9
2 5 0.8
How can I do that? Thanks in advance.
CodePudding user response:
You could use numpy.triu, where, and stack:
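import numpy as np
df = df.set_index('id')
# keep only values above 0.7 in the upper triangle (k=1 skips the diagonal,
# so self-pairs and mirrored duplicates are excluded)
out = df.where(np.triu(df.to_numpy()>0.7, k=1)).stack()\
        .rename_axis(['id1','id2']).reset_index(name='value')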
Output:
id1 id2 value
0 1 5 0.9
1 2 5 0.8
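
To try this end to end, here is a minimal self-contained sketch that rebuilds the frame from the question and applies the same steps. It assumes the column labels are integers (1, 2, 3, 5); if yours are strings, id2 will come out as strings too. The names mask and pairs and the explicit dropna() call are my additions, not part of the answer above; dropna() is there in case stack() keeps the NaN cells in your pandas version.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (column labels assumed to be ints).
df = pd.DataFrame({
    'id': [1, 2, 3, 5],
    1: [1.0, 0.6, 0.2, 0.9],
    2: [0.6, 1.0, 0.4, 0.8],
    3: [0.2, 0.4, 1.0, 0.2],
    5: [0.9, 0.8, 0.2, 1.0],
}).set_index('id')

# True only above the diagonal (k=1) and where the value exceeds 0.7.
mask = np.triu(df.to_numpy() > 0.7, k=1)

# where() turns every masked-out cell into NaN; stack() reshapes to long
# form, and dropna() removes any NaN rows that stack() may have kept.
pairs = (
    df.where(mask)
      .stack()
      .dropna()
      .rename_axis(['id1', 'id2'])
      .reset_index(name='value')
)
print(pairs)
```

Because the matrix is symmetric, taking only the upper triangle is what keeps (1, 5) and drops the mirrored (5, 1).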