Pandas / How to insert a variable number of lines inside a DataFrame?

Here is the structure of my dataframe:

plan	ADO_ver_x	ADO_incr_x	ADO_ver_y	ADO_incr_y
3ABP3	25.0	4.0	25.0	7.0

I would like to expand each row into ADO_incr_y - ADO_incr_x lines, which means in this case the result would be:

plan	ADO_ver_x	ADO_incr_x	ADO_ver_y	ADO_incr_y
3ABP3	25.0	4.0	25.0	5.0
3ABP3	25.0	5.0	25.0	6.0
3ABP3	25.0	6.0	25.0	7.0

Is there a pandas/Pythonic way to do that?

I was thinking of something like:

# number of copies per row; cast to int because np.repeat rejects float counts
reps = [int(val2 - val1) for val2, val1 in zip(df_insert["ADO_incr_y"], df_insert["ADO_incr_x"])]
df_insert.loc[np.repeat(df_insert.index.values, reps)]

But I don't get the incremental progression: 4 -> 5, 5 -> 6, 6 -> 7.

How can I get the index inside the list comprehension?

CodePudding user response:

You can repeat the data, then modify with groupby.cumcount():

repeats = df['ADO_incr_y'].sub(df['ADO_incr_x']).astype(int)
out = df.reindex(df.index.repeat(repeats))

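# offset each copy by its position within its original row (0, 1, 2, ...)
out['ADO_incr_x'] += out.groupby(level=0).cumcount()
# each line covers exactly one increment step
out['ADO_incr_y'] = out['ADO_incr_x'] + 1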
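
To check the repeat-then-cumcount idea end to end, here is a minimal self-contained sketch built on the sample row from the question. The helper name expand_increments and the final reset_index call are my own additions for illustration, not part of the original answer. It assumes the original index is unique (so that groupby(level=0) groups the copies of one source row) and that ADO_incr_y >= ADO_incr_x holds for every row.

```python
import pandas as pd


def expand_increments(df: pd.DataFrame) -> pd.DataFrame:
    """Split each row into one row per unit step from ADO_incr_x to ADO_incr_y.

    Hypothetical helper; assumes a unique index and ADO_incr_y >= ADO_incr_x.
    """
    # How many rows each original row should become.
    repeats = df["ADO_incr_y"].sub(df["ADO_incr_x"]).astype(int)

    # Repeat each index label that many times and pull the matching rows.
    out = df.loc[df.index.repeat(repeats.to_numpy())].copy()

    # Position of each copy within its original row: 0, 1, 2, ...
    step = out.groupby(level=0).cumcount()

    # Shift the start by the position; the end is always one step later.
    out["ADO_incr_x"] += step
    out["ADO_incr_y"] = out["ADO_incr_x"] + 1

    # Drop the duplicated index labels left over from the repeat.
    return out.reset_index(drop=True)


df = pd.DataFrame(
    {
        "plan": ["3ABP3"],
        "ADO_ver_x": [25.0],
        "ADO_incr_x": [4.0],
        "ADO_ver_y": [25.0],
        "ADO_incr_y": [7.0],
    }
)

result = expand_increments(df)
print(result)
```

For the sample row this yields three lines with ADO_incr_x going 4, 5, 6 and ADO_incr_y going 5, 6, 7. Rows where the two columns are equal produce a repeat count of 0 and therefore disappear from the result, which may or may not be what you want.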