I have 2 columns in an excel sheet with columns and values below
Asset Ids | FADEL Ids |
---|---|
fcb03 | 12500,12502 |
f813a | 4062 |
d79ca | 16145 |
fa632 | 6698 |
9c7cd | 4062, 6698 |
4d787 | 18933 |
5569a | 4062,6698 |
2cd98 | 16145 |
5d58d | 9965 |
52b30 | 12500,12502 |
I want an excel sheet with the below output
Asset Ids | FADEL Ids |
---|---|
fcb03,52b30 | 12500 |
fcb03,52b30 | 12502 |
f813a,9c7cd,5569a | 4062 |
d79ca,2cd98 | 16145 |
fa632,9c7cd,5569a | 6698 |
4d787 | 18933 |
5d5bd | 9965 |
I want the answer through a python code. I tried implementing this through pandas dataframe and I am struck at this point.
CodePudding user response:
We could try with get_dummies
with dot
s = df.set_index('Asset Ids')['FADEL Ids'].str.get_dummies(',')
out = s.T.dot(s.index ',').str[:-1].reset_index()
out
Out[130]:
index 0
0 12500 fcb03,52b30
1 12502 fcb03,52b30
2 16145 d79ca,2cd98
3 18933 4d787
4 4062 f813a,9c7cd,5569a
5 6698 fa632,9c7cd,5569a
6 9965 5d58d
CodePudding user response:
Assuming df
, the input DataFrame, you can split
, explode
, and groupby.agg
:
out = (df
.assign(**{'FADEL Ids': df['FADEL Ids'].str.split(',')})
.explode('FADEL Ids')
.groupby('FADEL Ids', as_index=False)
['Asset Ids'].agg(','.join)
)
print(out)
output:
FADEL Ids Asset Ids
0 6698 9c7cd
1 12500 fcb03,52b30
2 12502 fcb03,52b30
3 16145 d79ca,2cd98
4 18933 4d787
5 4062 f813a,9c7cd,5569a
6 6698 fa632,5569a
7 9965 5d58d