Home > Enterprise >  How do I create a while loop for this df that has moving average in every stage?
How do I create a while loop for this df that has moving average in every stage?

Time:11-16

So I want to spread the shipments per ID in the group one by one by looking at avg sales to determine who to give it to.

Here's my dataframe:

ID   STOREID    BAL       SALES   SHIP
1     STR1      50        5       18
1     STR2      6         7       18
1     STR3      74        4       18
2     STR1      35        3       500
2     STR2      5         4       500
2     STR3      54        7       500

While SHIP (grouped by ID) is greater than 0, calculate AVG (BAL/SALES) and the lowest AVG per group give 1 to its column BAL and 1 to its column final. And then repeat the process until SHIP is 0. The AVG would be different every stage which is why I wanted it to be a while loop.

Sample output of first round is below. So do this until SHIP is 0 and SUM of Final per ID is = to SHIP:

ID   STOREID   BAL   SALES  SHIP   AVG        Final
1    STR1      50    5      18     10         0
1    STR2      6     4      18     1.5        1
1    STR3      8     4      18     2          0
2    STR1      35    3      500    11.67      0
2    STR2      5     4      500    1.25       1
2    STR3      54    7      500    7.71       0

I've tried a couple of ways in SQL, I thought it would be better to do it in python but I haven't been doing a great job with my loop. Here's what I tried so far:

df['AVG'] = 0
df['FINAL'] = 0

for i in df.groupby(["ID"])['SHIP']:
    if i > 0:
          df['AVG'] = df['BAL'] / df['SALES']
          df['SHIP'] = df.groupby(["ID"])['SHIP']-1
          total = df.groupby(["ID"])["FINAL"].transform("cumsum")
          df['FINAL'] =   1
          df['A'] =   1
    else:
         df['FINAL'] = 0

CodePudding user response:

This was challenging because more than one row in the group can have the same average calculation. then it throws off the allocation.

This works on the example dataframe, if I understood you correctly.

d = {'ID': [1, 1, 1, 2,2,2], 'STOREID': ['str1', 'str2', 'str3','str1', 'str2', 'str3'],'BAL':[50, 6, 74, 35,5,54], 'SALES': [5, 7, 4, 3,4,7], 'SHIP': [18, 18, 18, 500,500,500]} 
df = pd.DataFrame(data=d)
df['AVG'] = 0
df['FINAL'] = 0

def calc_something(x):
    # print(x.iloc[0]['SHIP'])
    for i in range(x.iloc[0]['SHIP'])[0:500]:
          x['AVG'] = x['BAL'] / x['SALES']
          x['SHIP'] = x['SHIP']-1
          x = x.sort_values('AVG').reset_index(drop=True)
          # print(x.iloc[0, 2])
          x.iloc[0, 2] = x['BAL'][0]   1
          x.iloc[0, 6] = x['FINAL'][0]   1
    return x

df_final = df.groupby('ID').apply(calc_something).reset_index(drop=True).sort_values(['ID', 'STOREID'])
df_final

   ID STOREID  BAL  SALES  SHIP    AVG  FINAL
1   1    STR1   50      5     0 10.000      0
0   1    STR2   24      7     0  3.286     18
2   1    STR3   74      4     0 18.500      0
4   2    STR1  127      3     0 42.333     92
5   2    STR2  170      4     0 42.500    165
3   2    STR3  297      7     0 42.286    243
  • Related