I want to assign a unique code based on the string in the first column and the string in the second column. Whenever the string in the first column changes, the numbering should start again from 1.
Example
How can I do this using Python?
CodePudding user response:
Something like this could work:
Let's assume your data is stored in the data frame df.
Split the data frame on the unique values of the first column, number the distinct values of the second column within each piece, then concatenate the pieces again.
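import pandas as pd
# Split df into one sub-frame per unique value in the first column.
dfs = dict(tuple(df.groupby('colummen1')))
for key, group in dfs.items():
    group = group.copy()  # work on a copy, not a slice of the original frame
    # ngroup() numbers the groups from 0, so add 1 to start each block at 1.
    group['id'] = group.groupby(['colummen1', 'colummun2']).ngroup() + 1
    dfs[key] = group
df = pd.concat(list(dfs.values()))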
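The same idea can probably be written without splitting and concatenating. The following is only a sketch under assumptions: the sample data is made up for illustration, the column names are taken from the answer above, and pd.factorize is swapped in for ngroup() so that codes follow order of first appearance instead of sorted order.

```python
import pandas as pd

# Hypothetical sample data; column names follow the answer above.
df = pd.DataFrame({
    'colummen1': ['keb1', 'keb1', 'keb1', 'keb2', 'keb2'],
    'colummun2': ['com1', 'com2', 'com1', 'com3', 'com3'],
})

# factorize() numbers distinct values from 0 in order of first appearance,
# so adding 1 makes the codes restart at 1 within each 'colummen1' group.
df['id'] = df.groupby('colummen1')['colummun2'].transform(
    lambda s: pd.factorize(s)[0] + 1
)
```

With this variant, a second-column value that repeats inside the same first-column group receives the same code each time it appears.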
CodePudding user response:
You can use transform() after groupby, like below:
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ['keb1', 'keb1', 'keb1', 'keb2', 'keb2'],
                   'col2': ['com1', 'com2', 'com3', 'com1', 'com2']})
# Number the rows of each col1 group, starting from 1.
df['id'] = df.groupby('col1')['col2'].transform(lambda x: np.arange(len(x)) + 1)
Output:
>>> df
   col1  col2  id
0  keb1  com1   1
1  keb1  com2   2
2  keb1  com3   3
3  keb2  com1   1
4  keb2  com2   2
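If a running row number per group is all that is needed, groupby(...).cumcount() may be a shorter way to get the same result. This is a sketch on the same sample data as in this answer, not a claim about what the answer's author intended:

```python
import pandas as pd

df = pd.DataFrame({'col1': ['keb1', 'keb1', 'keb1', 'keb2', 'keb2'],
                   'col2': ['com1', 'com2', 'com3', 'com1', 'com2']})

# cumcount() numbers the rows of each col1 group from 0, so add 1.
df['id'] = df.groupby('col1').cumcount() + 1
```

Note that this numbers rows, not distinct values: if a col2 value were repeated inside one col1 group, each repetition would get its own id.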