I have a dataframe:
user group period
1 1 1
2 2 1
3 2 1
1 1 2
2 1 2
3 1 2
4 2 2
I want to make a new column ("period1") in the dataframe that would show the group number to which a user was assigned in period = 1. So it should look like this:
user group period period1
1 1 1 1
2 2 1 2
3 2 1 2
1 1 2 1
2 1 2 2
3 1 2 2
4 2 2 null
I tried to use groupby with the condition inside but it didn't help me:
df['period1']=df['group'].groupby([df['user'],df['period']==1]).transform(lambda s: max(s))
Please help!
CodePudding user response:
Use Series.map, after filtering the rows where period is 1 and indexing them by user:
df['period1'] = df['user'].map(df[df['period']==1].set_index('user')['group'])
print (df)
user group period period1
0 1 1 1 1.0
1 2 2 1 2.0
2 3 2 1 2.0
3 1 1 2 1.0
4 2 1 2 2.0
5 3 1 2 2.0
6 4 2 2 NaN
If you need integers, cast to the nullable Int64 dtype, which keeps the missing value as <NA>:
df['period1']=df['user'].map(df[df['period']==1].set_index('user')['group']).astype('Int64')
print (df)
user group period period1
0 1 1 1 1
1 2 2 1 2
2 3 2 1 2
3 1 1 2 1
4 2 1 2 2
5 3 1 2 2
6 4 2 2 <NA>
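For reference, here is a self-contained sketch of the same idea that rebuilds the sample data from the question. The name lookup and the drop_duplicates call are my own additions: as far as I know, map raises an error if the lookup Series has a duplicated index, so this assumes you want the first period-1 row when a user appears more than once there.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "user":   [1, 2, 3, 1, 2, 3, 4],
    "group":  [1, 2, 2, 1, 1, 1, 2],
    "period": [1, 1, 1, 2, 2, 2, 2],
})

# Lookup Series: index = user, value = that user's group in period 1.
# drop_duplicates is a defensive assumption (keeps the first period-1 row per user).
lookup = (
    df.loc[df["period"] == 1]
      .drop_duplicates("user")
      .set_index("user")["group"]
)

# Users with no period-1 row (user 4 here) get a missing value
df["period1"] = df["user"].map(lookup).astype("Int64")
print(df)
```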
CodePudding user response:
First you need to filter the dataframe for period 1:
period1 = df[df['period'] == 1]
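Then left-merge it back onto the original, taking the group column (renamed to period1) rather than period:
df = df.merge(right=period1[['user','group']].rename(columns={'group':'period1'}), on='user', how='left')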
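A runnable sketch of the merge approach on the question's data, in case it helps. The variable name merged is just illustrative. Note that the new column comes back as float with NaN for user 4, and that a user with several period-1 rows would produce duplicated rows after the merge, so this assumes at most one period-1 row per user.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "user":   [1, 2, 3, 1, 2, 3, 4],
    "group":  [1, 2, 2, 1, 1, 1, 2],
    "period": [1, 1, 1, 2, 2, 2, 2],
})

# Keep only period-1 rows, and only the columns needed for the join
period1 = (
    df.loc[df["period"] == 1, ["user", "group"]]
      .rename(columns={"group": "period1"})
)

# Left merge keeps every row of df; users absent from period 1 get NaN
merged = df.merge(period1, on="user", how="left")
print(merged)
```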