How do I create a while loop for this df that has moving average in every stage?-CodePudding

So I want to spread the shipments per ID in the group one by one by looking at avg sales to determine who to give it to.

Here's my dataframe:

ID   STOREID    BAL       SALES   SHIP
1     STR1      50        5       18
1     STR2      6         7       18
1     STR3      74        4       18
2     STR1      35        3       500
2     STR2      5         4       500
2     STR3      54        7       500

While SHIP (grouped by ID) is greater than 0, calculate AVG (BAL/SALES) and the lowest AVG per group give 1 to its column BAL and 1 to its column final. And then repeat the process until SHIP is 0. The AVG would be different every stage which is why I wanted it to be a while loop.

Sample output of first round is below. So do this until SHIP is 0 and SUM of Final per ID is = to SHIP:

ID   STOREID   BAL   SALES  SHIP   AVG        Final
1    STR1      50    5      18     10         0
1    STR2      6     4      18     1.5        1
1    STR3      8     4      18     2          0
2    STR1      35    3      500    11.67      0
2    STR2      5     4      500    1.25       1
2    STR3      54    7      500    7.71       0

I've tried a couple of ways in SQL, I thought it would be better to do it in python but I haven't been doing a great job with my loop. Here's what I tried so far:

df['AVG'] = 0
df['FINAL'] = 0

for i in df.groupby(["ID"])['SHIP']:
    if i > 0:
          df['AVG'] = df['BAL'] / df['SALES']
          df['SHIP'] = df.groupby(["ID"])['SHIP']-1
          total = df.groupby(["ID"])["FINAL"].transform("cumsum")
          df['FINAL'] =   1
          df['A'] =   1
    else:
         df['FINAL'] = 0

CodePudding user response：

This was challenging because more than one row in the group can have the same average calculation. then it throws off the allocation.

This works on the example dataframe, if I understood you correctly.

d = {'ID': [1, 1, 1, 2,2,2], 'STOREID': ['str1', 'str2', 'str3','str1', 'str2', 'str3'],'BAL':[50, 6, 74, 35,5,54], 'SALES': [5, 7, 4, 3,4,7], 'SHIP': [18, 18, 18, 500,500,500]} 
df = pd.DataFrame(data=d)
df['AVG'] = 0
df['FINAL'] = 0

def calc_something(x):
    # print(x.iloc[0]['SHIP'])
    for i in range(x.iloc[0]['SHIP'])[0:500]:
          x['AVG'] = x['BAL'] / x['SALES']
          x['SHIP'] = x['SHIP']-1
          x = x.sort_values('AVG').reset_index(drop=True)
          # print(x.iloc[0, 2])
          x.iloc[0, 2] = x['BAL'][0]   1
          x.iloc[0, 6] = x['FINAL'][0]   1
    return x

df_final = df.groupby('ID').apply(calc_something).reset_index(drop=True).sort_values(['ID', 'STOREID'])
df_final

   ID STOREID  BAL  SALES  SHIP    AVG  FINAL
1   1    STR1   50      5     0 10.000      0
0   1    STR2   24      7     0  3.286     18
2   1    STR3   74      4     0 18.500      0
4   2    STR1  127      3     0 42.333     92
5   2    STR2  170      4     0 42.500    165
3   2    STR3  297      7     0 42.286    243