Here is the structure of my dataframe:
plan | ADO_ver_x | ADO_incr_x | ADO_ver_y | ADO_incr_y |
---|---|---|---|---|
3ABP3 | 25.0 | 4.0 | 25.0 | 7.0 |
I would like to expand each row into (ADO_incr_y - ADO_incr_x) rows, which means that in this case the result would be:
plan | ADO_ver_x | ADO_incr_x | ADO_ver_y | ADO_incr_y |
---|---|---|---|---|
3ABP3 | 25.0 | 4.0 | 25.0 | 5.0 |
3ABP3 | 25.0 | 5.0 | 25.0 | 6.0 |
3ABP3 | 25.0 | 6.0 | 25.0 | 7.0 |
Is there a pandas/Pythonic way to do that?
I was thinking of something like:
reps = [int(val2 - val1) for val1, val2 in zip(df_insert["ADO_incr_x"], df_insert["ADO_incr_y"])]
df_insert.loc[np.repeat(df_insert.index.values, reps)]
But I don't get the incremental progression: 4 -> 5, 5 -> 6, 6 -> 7.
How can I get the index inside the list comprehension?
CodePudding user response:
You can repeat the rows, then offset the values with groupby.cumcount():
# number of copies needed for each row
repeats = df['ADO_incr_y'].sub(df['ADO_incr_x']).astype(int)
out = df.reindex(df.index.repeat(repeats))
# add 0, 1, 2, ... within each group of repeated rows
out['ADO_incr_x'] += out.groupby(level=0).cumcount()
out['ADO_incr_y'] = out['ADO_incr_x'] + 1
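
For reference, here is a self-contained sketch of the approach above, run on the single-row sample from the question. It assumes the index is unique (reindex raises on duplicate labels) and that ADO_incr_y - ADO_incr_x is a non-negative whole number. The final reset_index call is an optional addition for a clean 0..n-1 index and is not part of the original answer.

```python
import pandas as pd

# Sample frame copied from the question (one row).
df = pd.DataFrame({
    "plan": ["3ABP3"],
    "ADO_ver_x": [25.0],
    "ADO_incr_x": [4.0],
    "ADO_ver_y": [25.0],
    "ADO_incr_y": [7.0],
})

# How many copies of each row are needed (7 - 4 = 3 here).
repeats = df["ADO_incr_y"].sub(df["ADO_incr_x"]).astype(int)

# Repeat each index label that many times and pull the matching rows.
out = df.reindex(df.index.repeat(repeats))

# Within each group of repeated rows, add 0, 1, 2, ... to the start value.
out["ADO_incr_x"] += out.groupby(level=0).cumcount()

# Each step ends one increment after it starts.
out["ADO_incr_y"] = out["ADO_incr_x"] + 1

# Optional: drop the duplicated index labels left over from the repeat.
out = out.reset_index(drop=True)
print(out)
```

On the sample row this yields three rows, with ADO_incr_x going 4, 5, 6 and ADO_incr_y going 5, 6, 7.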