This is my dataset:
Dept | Cell culture | Bioinfo | Immunology | Trigonometry | Algebra | Biotech | Optics |
---|---|---|---|---|---|---|---|
Biotech | 1 | 1 | 1 | 0 | 0 | 1 | 0 |
Math | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
Physics | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
How I want my result:
Dept | 0 |
---|---|
Biotech | Cell culture |
Biotech | Bioinfo |
Biotech | Immunology |
Math | Trigonometry |
Math | Algebra |
Physics | Optics |
I need to form pairs that have the value one, but I also need to rid of those values which are the same in both column and row index - such as biotech here. Is there an easy way to do this?
CodePudding user response:
Try this:
#df = df.set_index('Dept') if needed move dept into the index
df.dot(df.columns ',').str.strip(',').str.split(',').explode().reset_index()
Output:
Dept 0
0 Biotech Cell culture
1 Biotech Bioinfo
2 Biotech Immunology
3 Biotech Biotech
4 Math Trigonometry
5 Math Algebra
6 Physics Optics