Home > Software engineering >  Pandas fillna() rows in a specific order
Pandas fillna() rows in a specific order

Time:12-31

I have problem with fillna() method. This is my example df, which represents quantity of items in a shop. I would like to fill all the NaNs. If there is a NaN, I would like to fill it with values from previous day or if it is NaN, then from the next day (the same product, the same shop). If all days for specific product and shop are NaNs, then I would like to fill it with 0. I am looking for the best pandas way to do it, I had some ideas with loops, but it doesn't look very well.

My df:

    day    shop  product  quantity
0     1  shop_A   apples       3.0
1     2  shop_A   apples       NaN
2     3  shop_A   apples       1.0
3     1  shop_A  bananas       NaN
4     2  shop_A  bananas       NaN
5     3  shop_A  bananas       NaN
6     1  shop_B   apples       NaN
7     2  shop_B   apples       NaN
8     3  shop_B   apples       2.0
9     1  shop_B  bananas       NaN
10    2  shop_B  bananas       4.0
11    3  shop_B  bananas       2.0

Expected df:

    day    shop  product  quantity
0     1  shop_A   apples       3.0
1     2  shop_A   apples       3.0
2     3  shop_A   apples       1.0
3     1  shop_A  bananas       0.0
4     2  shop_A  bananas       0.0
5     3  shop_A  bananas       0.0
6     1  shop_B   apples       2.0
7     2  shop_B   apples       2.0
8     3  shop_B   apples       2.0
9     1  shop_B  bananas       4.0
10    2  shop_B  bananas       4.0
11    3  shop_B  bananas       2.0

I also tried fillna(limit=3), but this isn't what am I looking for.

CodePudding user response:

You can sort by day using sort_values and then perform a grouped bfill and then what's left will just get a 0 with chaining a fillna(0):

df['quantity'] = df.sort_values(by='day')\
                 .groupby(['shop','product'])['quantity'].bfill(limit=3).fillna(0)

prints back:

    day    shop  product  quantity
0     1  shop_A   apples       3.0
1     2  shop_A   apples       1.0
2     3  shop_A   apples       1.0
3     1  shop_A  bananas       0.0
4     2  shop_A  bananas       0.0
5     3  shop_A  bananas       0.0
6     1  shop_B   apples       2.0
7     2  shop_B   apples       2.0
8     3  shop_B   apples       2.0
9     1  shop_B  bananas       4.0
10    2  shop_B  bananas       4.0
11    3  shop_B  bananas       2.0

This will give a the value of the next day to the NaN value of the previous day for each shop and product. You can similarly use ffill (or both), and perhaps linear interpolation, and your results will change accordingly. This is however what you need to get you started.

  • Related