Pandas / How to insert variable number of lines inside a DataFrame?

Time:09-20

Here is the structure of my DataFrame:

plan ADO_ver_x ADO_incr_x ADO_ver_y ADO_incr_y
3ABP3 25.0 4.0 25.0 7.0

I would like to expand each row into ADO_incr_y - ADO_incr_x lines, so in this case the result would be:

plan ADO_ver_x ADO_incr_x ADO_ver_y ADO_incr_y
3ABP3 25.0 4.0 25.0 5.0
3ABP3 25.0 5.0 25.0 6.0
3ABP3 25.0 6.0 25.0 7.0

Is there a pandas/Pythonic way to do that?

I was thinking something like :

reps = [ val2-val1 for val2, val1 in zip(df_insert["ADO_incr_y"],df_insert["ADO_incr_x"]) ]
df_insert.loc[np.repeat(df_insert.index.values, reps)]

But this only duplicates the row; I don't get the incremental progression: 4 -> 5, 5 -> 6, 6 -> 7

How can I get the position of each repeated row inside the list comprehension?

CodePudding user response:

You can repeat the data, then modify with groupby.cumcount():

repeats = df['ADO_incr_y'].sub(df['ADO_incr_x']).astype(int)
out = df.reindex(df.index.repeat(repeats))

out['ADO_incr_x'] += out.groupby(level=0).cumcount()
out['ADO_incr_y'] = out['ADO_incr_x'] + 1
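
For reference, here is a minimal self-contained sketch of the repeat-then-cumcount idea. It assumes the sample row from the question is rebuilt by hand, that the DataFrame index is unique, and that ADO_incr_y - ADO_incr_x is a non-negative whole number. The `step` variable and the final `reset_index` are my own additions for illustration.

```python
import pandas as pd

# Sample row rebuilt by hand from the question (assumed values).
df = pd.DataFrame({
    "plan": ["3ABP3"],
    "ADO_ver_x": [25.0],
    "ADO_incr_x": [4.0],
    "ADO_ver_y": [25.0],
    "ADO_incr_y": [7.0],
})

# Number of output rows per input row: ADO_incr_y - ADO_incr_x.
repeats = df["ADO_incr_y"].sub(df["ADO_incr_x"]).astype(int)

# Repeat each row; every copy keeps the index label of its source row.
out = df.loc[df.index.repeat(repeats)].copy()

# cumcount() numbers the copies of each source row 0, 1, 2, ...
# Converting to a NumPy array sidesteps alignment on the duplicated index.
step = out.groupby(level=0).cumcount().to_numpy()

# Shift the start of each copy by its position, then end one step later.
out["ADO_incr_x"] = out["ADO_incr_x"] + step
out["ADO_incr_y"] = out["ADO_incr_x"] + 1

# Optional: replace the duplicated index labels with a fresh 0..n-1 index.
out = out.reset_index(drop=True)
print(out)
```

With the sample row this gives three lines whose (ADO_incr_x, ADO_incr_y) pairs are (4, 5), (5, 6) and (6, 7). The cumcount() result is what plays the role of the "index inside the list comprehension" asked about in the question.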