I have the input in the following format. I wanted to assign a group number to mapped Col A to ColB. For example if you consider 222 in COL A, which is pointing to 111 and 112 in Col B, I want all the three numbers to be assigned to same group number. Please help me with this. Thanks in advance
CodePudding user response:
You can use factorize
to create the group and melt
to reshape:
out = (df
.assign(grpnumber=pd.factorize(df['colA'])[0] 1)
.melt('grpnumber', value_name='ID')
.drop_duplicates()
.sort_values(by='grpnumber', ignore_index=True)
.drop(columns='variable')
)
Output:
grpnumber ID
0 1 222
1 1 111
2 1 112
3 2 333
4 2 114
5 2 115
CodePudding user response:
Check Below code using PANDAS
import pandas as pd
def list_series(val):
return list(val)
df = df.groupby(['COL A']).agg(GRP_COL_B=('COL B',list_series)).reset_index()
df['ID'] = df.apply(lambda x: x['GRP_COL_B'] [x['COL A']], axis=1)
df['grpnumber'] = df.index 1
df[['ID','grpnumber']].explode('ID')
Output: