Pandas interpolate within a groupby for one column

Time:11-24

This is similar to the question "Pandas interpolate within a groupby", but the answer there applies interpolate() to all columns. How do I limit interpolate() to a single column?

Input

    filename    val1    val2
t                   
1   file1.csv   5       10
2   file1.csv   NaN     NaN
3   file1.csv   15      20
6   file2.csv   NaN     NaN
7   file2.csv   10      20
8   file2.csv   12      15

Expected Output

    filename    val1    val2
t                   
1   file1.csv   5       10
2   file1.csv   NaN     15
3   file1.csv   15      20
6   file2.csv   NaN     NaN
7   file2.csv   10      20
8   file2.csv   12      15

This attempt returns only the val2 column, not the rest of the columns.

df = df.groupby('filename').apply(lambda group: group['val2'].interpolate(method='index'))

CodePudding user response:

A direct approach:

import pandas as pd

df = pd.read_clipboard()  # clipboard contains the OP's sample data
# Interpolate only column "val2", separately within each "filename" group,
# so values never leak from one file into the next
df["val2_interpolated"] = df.groupby("filename")["val2"].transform(
    lambda s: s.interpolate(method="linear")
)

returns:

    filename  val1  val2  val2_interpolated
t
1  file1.csv   5.0  10.0               10.0
2  file1.csv   NaN   NaN               15.0
3  file1.csv  15.0  20.0               20.0
6  file2.csv   NaN   NaN                NaN
7  file2.csv  10.0  20.0               20.0
8  file2.csv  12.0  15.0               15.0
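
For a version that does not depend on the clipboard, here is a minimal sketch that rebuilds the question's sample data by hand and writes the interpolated values back into val2 itself. The DataFrame construction is an assumption about how the data is laid out (t as the index); the key step is selecting the single column from the groupby before calling transform.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; "t" is the index.
df = pd.DataFrame(
    {
        "filename": ["file1.csv"] * 3 + ["file2.csv"] * 3,
        "val1": [5, np.nan, 15, np.nan, 10, 12],
        "val2": [10, np.nan, 20, np.nan, 20, 15],
    },
    index=pd.Index([1, 2, 3, 6, 7, 8], name="t"),
)

# Selecting "val2" from the groupby restricts the work to that one column.
# transform returns a Series aligned with df's index, so it can be assigned
# straight back without disturbing "filename" or "val1".
df["val2"] = df.groupby("filename")["val2"].transform(
    lambda s: s.interpolate(method="index")
)

print(df)
```

With this data, val2 at t=2 becomes 15.0, while val1 keeps its NaN values. The leading NaN in val2 at t=6 also stays NaN, because interpolate() fills forward by default and file2.csv has no earlier value in its own group.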