I have a dataframe:
task_name task_id project_id user
zoo 10 100 NaN
zoo 11 110 NaN
foo 22 100 NaN
foo 23 110 NaN
xyz 33 100 NaN
xyz 34 110 NaN
qwe 40 100 NaN
Here task_name is shared between rows, task_id is unique, and project_id takes two values. So the same task_name appears in two different project_id values, each time with a unique task_id.
I need to fill the user column with users, subject to two conditions:
First: each user has a coefficient (factor) that limits how many tasks can be assigned to that user, for example only 4 from this df. Second: each user can be assigned to a given task_name only once. For example, user 'dude' can be assigned to task_name zoo with task_id 10 in project_id 100, but not to both zoo rows (task_id 10 and 11 in project_id 100 and 110). So I need to check that no user is assigned twice within the same task_name.
My expected output:
task_name task_id project_id user
zoo 10 100 dude
zoo 11 110 user2
foo 22 100 dude
foo 23 110 user2
xyz 33 100 dude
xyz 34 110 user2
qwe 40 100 dude
I tried this, but without success:
df.apply(lambda x: 'dude' if x['project_id'] == 100 else np.nan, axis=1)
df.sort_values(by='project_id', inplace=True)
df['user'][0:4] = 'dude'
CodePudding user response:
From the sample data, it seems the solution can be simplified: if project_id is 100, assign dude, else user2:
df['user'] = np.where(df['project_id'] == 100,'dude','user2')
If you need to assign dude to the first N rows (here 4) after sorting by project_id:
df.sort_values(by='project_id', inplace=True, ignore_index=True)
df['user'] = np.where(df.index < 4, 'dude','user2')
print (df)
task_name task_id project_id user
0 zoo 10 100 dude
1 foo 22 100 dude
2 xyz 33 100 dude
3 qwe 40 100 dude
4 zoo 11 110 user2
5 foo 23 110 user2
6 xyz 34 110 user2
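Both snippets above rely on the layout of the sample data rather than on the two stated conditions. If the conditions need to be enforced explicitly, one possible approach is a simple greedy pass over the rows. This is only a sketch: the capacity dict and the assign_users helper are hypothetical names, the limit of 4 for user2 is assumed (the question only gives 4 as an example), and rows that cannot be assigned to anyone are left as None.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "task_name": ["zoo", "zoo", "foo", "foo", "xyz", "xyz", "qwe"],
    "task_id": [10, 11, 22, 23, 33, 34, 40],
    "project_id": [100, 110, 100, 110, 100, 110, 100],
})

# Assumed capacities: the maximum number of tasks each user may receive.
capacity = {"dude": 4, "user2": 4}


def assign_users(frame, capacity):
    """Greedy assignment (hypothetical helper): for each row, pick the
    first user who still has capacity and has not yet been assigned
    to that row's task_name."""
    remaining = dict(capacity)   # tasks each user can still take
    taken_by_task = {}           # task_name -> users already assigned to it
    result = []
    for name in frame["task_name"]:
        taken = taken_by_task.setdefault(name, set())
        choice = next(
            (u for u in remaining if remaining[u] > 0 and u not in taken),
            None,  # nobody is eligible for this row
        )
        if choice is not None:
            remaining[choice] -= 1
            taken.add(choice)
        result.append(choice)
    return result


df["user"] = assign_users(df, capacity)
print(df)
```

With the sample data and these assumed capacities, the rows keep their original order and the result matches the expected output in the question. A greedy pass does not guarantee the best possible assignment when capacities are tight, so it may leave rows unassigned that a smarter matching could fill.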