Reversing 2 columns in an excel sheet with one column having a single variable and the other column-CodePudding

I have 2 columns in an excel sheet with columns and values below

Asset Ids	FADEL Ids
fcb03	12500,12502
f813a	4062
d79ca	16145
fa632	6698
9c7cd	4062, 6698
4d787	18933
5569a	4062,6698
2cd98	16145
5d58d	9965
52b30	12500,12502

I want an excel sheet with the below output

Asset Ids	FADEL Ids
fcb03,52b30	12500
fcb03,52b30	12502
f813a,9c7cd,5569a	4062
d79ca,2cd98	16145
fa632,9c7cd,5569a	6698
4d787	18933
5d5bd	9965

I want the answer through a python code. I tried implementing this through pandas dataframe and I am struck at this point.

CodePudding user response：

We could try with get_dummies with dot

s = df.set_index('Asset Ids')['FADEL Ids'].str.get_dummies(',')
out = s.T.dot(s.index ',').str[:-1].reset_index()
out
Out[130]: 
   index                  0
0  12500        fcb03,52b30
1  12502        fcb03,52b30
2  16145        d79ca,2cd98
3  18933              4d787
4   4062  f813a,9c7cd,5569a
5   6698  fa632,9c7cd,5569a
6   9965              5d58d

CodePudding user response：

Assuming df, the input DataFrame, you can split, explode, and groupby.agg:

out = (df
 .assign(**{'FADEL Ids': df['FADEL Ids'].str.split(',')})
 .explode('FADEL Ids')
 .groupby('FADEL Ids', as_index=False)
 ['Asset Ids'].agg(','.join)
)

print(out)

output:

  FADEL Ids          Asset Ids
0      6698              9c7cd
1     12500        fcb03,52b30
2     12502        fcb03,52b30
3     16145        d79ca,2cd98
4     18933              4d787
5      4062  f813a,9c7cd,5569a
6      6698        fa632,5569a
7      9965              5d58d