Home > Net >  Fill null and next value with avarge value
Fill null and next value with avarge value

Time:03-07

i work with customers consumptions and sometime didn't have this consumption for month or more so the first consumption after that need to break it down into those months example

df = pd.DataFrame({'customerId':[1,1,1,1,1,1,1,2,2,2,2,2,2,2],
                   'month':['2021-10-01','2021-11-01','2021-12-01','2022-01-01','2022-02-01','2022-03-01','2022-04-01','2021-10-01','2021-11-01','2021-12-01','2022-01-01','2022-02-01','2022-03-01','2022-04-01'],
                   'consumption':[100,130,0,0,400,140,105,500,0,0,0,0,0,3300]})

bfill() return same value not mean (value/count of null 1) desired value

'c':[100,130,133,133,133,140,105,500,550,550,550,550,550,550]

CodePudding user response:

You can try something like this:

df = pd.DataFrame({'customerId':[1,1,1,1,1,1,1,2,2,2,2,2,2,2],
                   'month':['2021-10-01','2021-11-01','2021-12-01','2022-01-01','2022-02-01','2022-03-01','2022-04-01','2021-10-01','2021-11-01','2021-12-01','2022-01-01','2022-02-01','2022-03-01','2022-04-01'],
                   'consumption':[100,130,0,0,400,140,105,500,0,0,0,0,0,3300]})

df['grp'] = df['consumption'].ne(0)[::-1].cumsum()

df['c'] = df.groupby(['customerId', 'grp'])['consumption'].transform('mean')

df

Output:

    customerId       month  consumption  grp           c
0            1  2021-10-01          100    7  100.000000
1            1  2021-11-01          130    6  130.000000
2            1  2021-12-01            0    5  133.333333
3            1  2022-01-01            0    5  133.333333
4            1  2022-02-01          400    5  133.333333
5            1  2022-03-01          140    4  140.000000
6            1  2022-04-01          105    3  105.000000
7            2  2021-10-01          500    2  500.000000
8            2  2021-11-01            0    1  550.000000
9            2  2021-12-01            0    1  550.000000
10           2  2022-01-01            0    1  550.000000
11           2  2022-02-01            0    1  550.000000
12           2  2022-03-01            0    1  550.000000
13           2  2022-04-01         3300    1  550.000000

Details:

  • Create a group by checking for zero, the do a cumsum in reverse order to group zeroes with the next non-zero value.
  • Groupby that group and transform mean to distribute that non-zero value across zeroes.
  • Related