How to extract pair values from dataframe with condition?

Time:04-05

I have a dataframe:

id   1    2    3    5
1    1   0.6  0.2  0.9
2   0.6  1    0.4  0.8
3   0.2  0.4  1    0.2
5   0.9  0.8  0.2  1

The columns are id, 1, 2, 3 and 5. I want to extract each pair of an id value and a column label whose value is higher than 0.7, skipping the diagonal and listing each pair only once. So the desired result is:

id1     id2     value
 1       5      0.9
 2       5      0.8

How can I do that? Thanks in advance.

CodePudding user response:

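You could use numpy.triu together with where and stack. Passing k=1 restricts the mask to entries strictly above the diagonal, so self-pairs and mirrored duplicates are excluded: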

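import numpy as np

# Use the id column as the row index so only the numeric matrix remains.
df = df.set_index('id')
# Keep values > 0.7 strictly above the diagonal; everything else becomes NaN, which stack() leaves out.
out = df.where(np.triu(df.to_numpy() > 0.7, k=1)).stack()\
        .rename_axis(['id1','id2']).reset_index(name='value')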

Output:

   id1 id2  value
0    1   5    0.9
1    2   5    0.8
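
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It is a variation on the answer above, not the same code: it keeps the upper-triangle idea but uses numpy.triu_indices_from to pick the positions directly, so the result does not depend on how stack treats NaN. The way the sample frame is built (integer column labels, the helper names m, values, rows, cols and keep) is an assumption for illustration; the question does not show how df was created.

```python
import numpy as np
import pandas as pd

# Sample data from the question (column labels assumed to be integers).
df = pd.DataFrame({
    "id": [1, 2, 3, 5],
    1: [1.0, 0.6, 0.2, 0.9],
    2: [0.6, 1.0, 0.4, 0.8],
    3: [0.2, 0.4, 1.0, 0.2],
    5: [0.9, 0.8, 0.2, 1.0],
})

m = df.set_index("id")
values = m.to_numpy()

# Positions strictly above the diagonal (k=1), so each pair is visited once.
rows, cols = np.triu_indices_from(values, k=1)

# Keep only the pairs whose value exceeds the threshold.
keep = values[rows, cols] > 0.7

out = pd.DataFrame({
    "id1": m.index.to_numpy()[rows][keep],
    "id2": m.columns.to_numpy()[cols][keep],
    "value": values[rows, cols][keep],
})
print(out)
```

Both versions assume the matrix is symmetric, with rows and columns in the same id order; otherwise the upper triangle would not cover every pair.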