Home > Back-end >  Add missing rows per group to fill a series
Add missing rows per group to fill a series

Time:11-11

I have the following df:

Name Time Elapsed Amount
A 2 $2
A 5 $1
A 7 $6
B 3 $3
B 5 $5

I would like to expand this per group so that I get all the times in between the min and max, and then fill downwards (assuming it's sorted by name then by time elapsed) to produce the following output

Name Time Elapsed Amount
A 2 $2
A 3 $2
A 4 $2
A 5 $1
A 6 $1
A 7 $6
B 3 $3
B 4 $3
B 5 $5

I tried the following:

df = df(df.set_index('Time Elapsed')
          .groupby('Name')['Amount']
          .apply(lambda x: x.reindex(range(x.index.min(), x.index.max() 1))
          .ffill().fillna(0)).reset_index)

and got a ValueError: cannot reindex on an axis with duplicate labels.

Feel free to offer a solution that does not use any of my code.

CodePudding user response:

try this:

def func(g: pd.DataFrame):
    tmp = g.set_index('Time Elapsed')
    res = tmp.reindex(
        np.arange(tmp.index.min(), tmp.index.max() 1),
        method='ffill')
    return res

grouped = df.groupby('Name', as_index=False)
result = grouped.apply(func).reset_index().reindex(columns=df.columns)
print(result)
>>>
Name    Time   Elapsed   Amount
0       A      2         $2
1       A      3         $2
2       A      4         $2
3       A      5         $1
4       A      6         $1
5       A      7         $6
6       B      3         $3
7       B      4         $3
8       B      5         $5

CodePudding user response:

One option is with complete from pyjanitor, to expose the missing rows:

# pip install pyjanitor
import pandas as pd
import janitor

# build a dictionary of all the possible times
# the key of the dictionary should be the column to be expanded
times = {"Time Elapsed" : lambda df: range(df.min(), df.max()   1)}
df.complete(times, by = 'Name').ffill()
  Name  Time Elapsed Amount
0    A             2     $2
1    A             3     $2
2    A             4     $2
3    A             5     $1
4    A             6     $1
5    A             7     $6
6    B             3     $3
7    B             4     $3
8    B             5     $5
  • Related