Fill by group and between two values

Time:12-01

I want to fill all rows between two values, by group. In each group, var1 contains two values equal to 1, and I want to fill the missing rows between those two 1s. var1 is what I have, var2 is what I want, and var3 is what my code produces, which differs from var2:

var1 group var2 var3
NaN  1     NaN  NaN
NaN  1     NaN  NaN
1    1     1    1
NaN  1     1    1
NaN  1     1    1
1    1     1    1
NaN  1     NaN  1
NaN  1     NaN  1   
1    2     1    1
NaN  2     1    1
1    2     1    1
NaN  2     NaN  1

My code:

df['var3'] = df.groupby('group')['var1'].ffill()

Answer:

Assuming the values are only 1 or NaN, you can compute both groupby.ffill and groupby.bfill and keep only the values where the two results are identical:

g = df.groupby('group')['var1']

s1 = g.ffill()
s2 = g.bfill()

df['var2'] = s1.where(s1.eq(s2))

Output:

    var1  group  var2
0    NaN      1   NaN
1    NaN      1   NaN
2    1.0      1   1.0
3    NaN      1   1.0
4    NaN      1   1.0
5    1.0      1   1.0
6    NaN      1   NaN
7    NaN      1   NaN
8    1.0      2   1.0
9    NaN      2   1.0
10   1.0      2   1.0
11   NaN      2   NaN
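
To reproduce the output above end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the question's sample table, and the helper name `fill_between` is hypothetical (it is not part of pandas). It relies on the same assumption as the answer, namely that the markers are always the same value:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question's table (var1 and group only).
df = pd.DataFrame({
    "var1": [np.nan, np.nan, 1, np.nan, np.nan, 1, np.nan, np.nan,
             1, np.nan, 1, np.nan],
    "group": [1] * 8 + [2] * 4,
})


def fill_between(frame, value_col="var1", group_col="group"):
    """Fill NaNs that lie between two equal markers within each group.

    Hypothetical helper: assumes the marker values are identical
    (here, always 1), as stated in the answer.
    """
    g = frame.groupby(group_col)[value_col]
    forward = g.ffill()   # propagates each marker downward within its group
    backward = g.bfill()  # propagates each marker upward within its group
    # Keep only positions where both directions agree; NaN elsewhere.
    return forward.where(forward.eq(backward))


df["var2"] = fill_between(df)
print(df)
```

Rows before the first marker have no forward-filled value, and rows after the last marker have no backward-filled value, so the comparison is False there and those rows stay NaN.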

Intermediates:

    var1  group  var2  ffill  bfill
0    NaN      1   NaN    NaN    1.0
1    NaN      1   NaN    NaN    1.0
2    1.0      1   1.0    1.0    1.0
3    NaN      1   1.0    1.0    1.0
4    NaN      1   1.0    1.0    1.0
5    1.0      1   1.0    1.0    1.0
6    NaN      1   NaN    1.0    NaN
7    NaN      1   NaN    1.0    NaN
8    1.0      2   1.0    1.0    1.0
9    NaN      2   1.0    1.0    1.0
10   1.0      2   1.0    1.0    1.0
11   NaN      2   NaN    1.0    NaN
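
The intermediates suggest that a row lies between two markers exactly when both the forward fill and the backward fill are non-missing. If the markers were not all identical, the equality test would no longer work. One possible variation, sketched here under that assumption and not taken from the answer, masks on `notna()` instead. The data and the column names `marker` and `filled` are made up for illustration:

```python
import numpy as np
import pandas as pd

# Illustrative data (not from the question): group "a" opens with 1 and
# closes with 2, so an equality test between ffill and bfill would fail.
df = pd.DataFrame({
    "group": ["a"] * 5 + ["b"] * 3,
    "marker": [np.nan, 1, np.nan, 2, np.nan, 3, np.nan, np.nan],
})

g = df.groupby("group")["marker"]
forward = g.ffill()
backward = g.bfill()

# A row is "inside" when its group has a marker at or before it
# and a marker at or after it.
inside = forward.notna() & backward.notna()

# Inside rows take the preceding marker's value; all other rows stay NaN.
df["filled"] = forward.where(inside)
print(df)
```

In this variant the filled rows carry the value of the preceding marker; whether that is the right fill value depends on the data, so treat it as a starting point.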