Interpolate on Pandas groups if group ID is in list-CodePudding

I'm trying to interpolate based on grouped pandas data using lambda.

I know that with the below data:

df = pd.DataFrame({'ID':[1, 1, 1, 2, 2, 2, 3, 3, 3],
                   'Value' : [np.nan, 5, np.nan, 7, np.nan, 9, 1, 3, np.nan]})

   ID  Value
0   1    NaN
1   1    5.0
2   1    NaN
3   2    7.0
4   2    NaN
5   2    9.0
6   3    1.0
7   3    3.0
8   3    NaN

I can use lambda to interpolate within each group and fill in the gaps like this:

df.groupby('ID').apply(lambda group: group.interpolate(limit_direction='both'))

However, I only want to fill in the gaps on groups that are specified within a set list, for example:

int_IDs = [1, 3]

Such that those IDs have their blanks filled, but the remaining IDs don't, so the output would be:

   ID  Value
0   1    5.0
1   1    5.0
2   1    5.0
3   2    7.0
4   2    NaN
5   2    9.0
6   3    1.0
7   3    3.0
8   3    3.0

I've tried adding a conditional into the lambda, but that gives ValueError: The truth value of a DataFrame is ambiguous.:

df.groupby('ID').apply(lambda group: group.interpolate(limit_direction='both') if group in int_IDs else group)

How can I do this?

CodePudding user response：

Use a mask to restrict the groupby:

int_IDs = [1, 3]
m = df['ID'].isin(int_IDs)
df['New_Value'] = (df[m]
 .groupby('ID')['Value']
 .apply(lambda group: group.interpolate(limit_direction='both'))
 .combine_first(df['Value'])
)

output:

   ID  Value  New_Value
0   1    NaN        5.0
1   1    5.0        5.0
2   1    NaN        5.0
3   2    7.0        7.0
4   2    NaN        NaN
5   2    9.0        9.0
6   3    1.0        1.0
7   3    3.0        3.0
8   3    NaN        3.0

In place:

int_IDs = [1, 3]
m = df['ID'].isin(int_IDs)
df.loc[m, 'Value'] = (df[m]
 .groupby('ID')['Value']
 .apply(lambda group: group.interpolate(limit_direction='both'))
)

output:

   ID  Value
0   1    5.0
1   1    5.0
2   1    5.0
3   2    7.0
4   2    NaN
5   2    9.0
6   3    1.0
7   3    3.0
8   3    3.0