I have the below 2 dataframes
Dataframe_1
Salesguy | limit |
---|---|
A | 10 |
B | 11 |
C | 0 |
D | 14 |
E | 6 |
There is another dataframe2, which contains some shop details with 10 columns and say 1000 rows. I need to assign the salesguys to the rows in dataframe2 in a new 11th column in round robin manner (ABCDE ABCDE ..so on). But the assignment needs to stop once the corresponding limit (in column 2 of dataframe_1) for the salesguy is reached.
for ex - since limit for C is 0, the assignment should be ABDE ABDE,
after 6 iterations, it will become ABD ABD (as the limit for E after 6 iterations will be 0)
Can anyone please help with the python code for this ?
I am able to assign the salesguys in the round robin manner using a list
l = [A,B,C,D,E]
dataframe_2['New']=''
dataframe_2.loc['New']=l
But I am unable to figure how to use the column 2 to set the corresponding limits for each salesguy.
CodePudding user response:
You can replicate the values with Series.repeat
, sort them in round robin with sort_values
and groupby.cumcount
:
df2['New'] = (df1['Salesguy'].repeat(df1['limit'])
.sort_values(key=lambda s: s.groupby(s).cumcount(),
kind='stable', ignore_index=True)
)
print(df2)
Example:
dummy New
0 82 A
1 2 B
2 11 D
3 7 E
4 58 A
.. ... ...
995 35 NaN
996 32 NaN
997 89 NaN
998 36 NaN
999 81 NaN
[1000 rows x 2 columns]
Used input:
df2 = pd.DataFrame({'dummy': np.random.randint(0,100, size=1000)})
CodePudding user response:
Here is a lengthy code for doing it. After a series of for-loops and getting the list, you extend it to match the length of axis 0.
def get_salesguy_list(df1):
dict1 = df1.set_index('Salesguy')['limit'].to_dict()
dict2 = dict1.copy()
lst = []
for salesguy in dict1:
for i in range(dict1[salesguy]):
for s in dict1:
if dict2[s] > 0:
# print(s, dict2[s])
lst.append(s)
dict2[s] -= 1
return lst
a_list = get_salesguy_list(df1)
b_list = []
iter_range = int(df2.shape[0]/len(a_list)) 1 # maths to get no. of repetative appendings
for i in range(iter_range):
for item in a_list:
b_list.append(item)
b_list = b_list[:df2.shape[0]] # discard the extra items
df2['col_11'] = pd.Series(b_list) # your column-11 of dataframe_2