Groupby and assign operation result to each group

import pandas as pd

df = pd.DataFrame({'ID': ['A', 'A', 'A', 'A', 'A'],
                   'target': ['B', 'B', 'B', 'B', 'C'],
                   'length': [208, 315, 1987, 3775, 200],
                   'start': [139403, 140668, 141726, 143705, 108],
                   'end': [139609, 140982, 143711, 147467, 208]})

  ID target  length   start     end
0  A      B     208  139403  139609
1  A      B     315  140668  140982
2  A      B    1987  141726  143711
3  A      B    3775  143705  147467
4  A      C     200     108     208

If I perform the operation:

(df.assign(length=
           df['start'].lt(df['end'].shift())
           .mul(df['start']-df['end'].shift(fill_value=0))
           .add(df['length'])))

I get the correct result, but how do I apply this logic to each group in a groupby?

for (a, b) in df.groupby(['start','end']):
    (df.assign(length=
           df['start'].lt(df['end'].shift())
           .mul(df['start']-df['end'].shift(fill_value=0))
           .add(df['length'])))

This leaves the DataFrame unchanged.

Answer:

Group the DataFrame on the required columns (ID and target), shift the end column within each group, then apply your formula as usual:

s = df.groupby(['ID', 'target'])['end'].shift()
df['length'] = df['start'].lt(s) * df['start'].sub(s.fillna(0)) + df['length']

  ID target  length   start     end
0  A      B   208.0  139403  139609
1  A      B   315.0  140668  140982
2  A      B  1987.0  141726  143711
3  A      B  3769.0  143705  147467
4  A      C   200.0     108     208
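
For reference, here is a self-contained sketch of the same idea that can be run from scratch. It leaves the original df untouched and casts length back to integers, since the NaN produced by the grouped shift turns the column into floats. The names prev_end, overlap and result are illustrative and not part of the original answer, and using Series.where in place of the boolean multiplication is a stylistic substitution that should give the same numbers on this data.

```python
import pandas as pd

df = pd.DataFrame({'ID': ['A', 'A', 'A', 'A', 'A'],
                   'target': ['B', 'B', 'B', 'B', 'C'],
                   'length': [208, 315, 1987, 3775, 200],
                   'start': [139403, 140668, 141726, 143705, 108],
                   'end': [139609, 140982, 143711, 147467, 208]})

# Previous row's 'end' within each (ID, target) group.
# The first row of every group has no predecessor, so it gets NaN.
prev_end = df.groupby(['ID', 'target'])['end'].shift()

# Where a row starts before the previous row ends, keep the (negative)
# difference start - prev_end; everywhere else use 0. Comparisons against
# NaN are False, so the first row of each group also gets 0.
overlap = (df['start'] - prev_end).where(df['start'] < prev_end, 0)

# assign returns a new DataFrame; df itself is not modified.
result = df.assign(length=(df['length'] + overlap).astype('int64'))
print(result)
```

Only row 3 changes (3775 becomes 3769), because it is the only row whose start falls before the previous end in the same group. Row 4 belongs to a different target group, so it is not compared against row 3.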