Home > Net >  Loop over data in column 1 from dataframe_1 & assign to rows of a different dataframe_2 in a new col
Loop over data in column 1 from dataframe_1 & assign to rows of a different dataframe_2 in a new col

Time:12-07

I have the below 2 dataframes

Dataframe_1

Salesguy limit
A 10
B 11
C 0
D 14
E 6

There is another dataframe2, which contains some shop details with 10 columns and say 1000 rows. I need to assign the salesguys to the rows in dataframe2 in a new 11th column in round robin manner (ABCDE ABCDE ..so on). But the assignment needs to stop once the corresponding limit (in column 2 of dataframe_1) for the salesguy is reached.

for ex - since limit for C is 0, the assignment should be ABDE ABDE,

after 6 iterations, it will become ABD ABD (as the limit for E after 6 iterations will be 0)

Can anyone please help with the python code for this ?

I am able to assign the salesguys in the round robin manner using a list

l = [A,B,C,D,E]
dataframe_2['New']=''
dataframe_2.loc['New']=l
But I am unable to figure how to use the column 2 to set the corresponding limits for each salesguy.

CodePudding user response:

You can replicate the values with Series.repeat, sort them in round robin with sort_values and groupby.cumcount:

df2['New'] = (df1['Salesguy'].repeat(df1['limit'])
              .sort_values(key=lambda s: s.groupby(s).cumcount(),
                           kind='stable', ignore_index=True)
             )

print(df2)

Example:

     dummy  New
0       82    A
1        2    B
2       11    D
3        7    E
4       58    A
..     ...  ...
995     35  NaN
996     32  NaN
997     89  NaN
998     36  NaN
999     81  NaN

[1000 rows x 2 columns]

Used input:

df2 = pd.DataFrame({'dummy': np.random.randint(0,100, size=1000)})

CodePudding user response:

Here is a lengthy code for doing it. After a series of for-loops and getting the list, you extend it to match the length of axis 0.

def get_salesguy_list(df1):
  dict1 = df1.set_index('Salesguy')['limit'].to_dict()
  dict2 = dict1.copy()
  lst = []
  for salesguy in dict1:
    for i in range(dict1[salesguy]):
      for s in dict1:
        if dict2[s] > 0:
  #         print(s, dict2[s])
          lst.append(s)
        dict2[s] -= 1
  return lst

a_list = get_salesguy_list(df1)

b_list = []

iter_range = int(df2.shape[0]/len(a_list))   1 # maths to get no. of repetative appendings

for i in range(iter_range):
  for item in a_list:
    b_list.append(item)
b_list = b_list[:df2.shape[0]] # discard the extra items

df2['col_11']  = pd.Series(b_list) # your column-11 of dataframe_2
  • Related