Home > Mobile >  count uniq ids between column and index (like pivot table)
count uniq ids between column and index (like pivot table)

Time:07-23

I have a table:

id Name
1 A
2 A
2 B
2 C
3 B
3 C
4 C
4 A

And I need a new table where unique IDs are calculated at the intersection of names. That is, for example, ID 2 has both A and B, that is, at the intersection of A and B, there is already one ID. Something like this:

Name A B C
A - 1 2
B 1 - 2
C 2 2 -

I tried with pandas.pivot_table, but i couldn't understand how

CodePudding user response:

Let us use crosstab to create frequency table, then calculate the inner product to generate similarity / intersection matrix and mask the diagonal values with 0

s = pd.crosstab(df['Name'], df['id']).clip(upper=1)
s = s @ s.T
np.fill_diagonal(s.values, 0)

print(s)

Name  A  B  C
Name         
A     0  1  2
B     1  0  2
C     2  2  0
  • Related