I have a dataframe
df_in = pd.DataFrame([["A","X",5,4,1],["B","Y",3,3,1],["C","Y",4,7,4]], columns=['col1', 'col2', 'col3', 'col4','col5'])
I want to repeat a row n times, with the value in col4 increasing by one on each repeat, starting from the value already in col4.
Example: I want row B to appear 3 times, with col4 counting up from its current value (3, 4, 5). Similarly, row C should appear 2 times, with col4 counting up from its current value (7, 8).
Expected Output:
df_Out = pd.DataFrame([["A","X",5,4,1],["B","Y",3,3,1],["B","Y",3,4,1],["B","Y",3,5,1],["C","Y",4,7,4],["C","Y",4,8,4]], columns=['col1', 'col2', 'col3', 'col4','col5'])
How can I do this?
CodePudding user response:
Create a dictionary with the number of repeats per col1 value and map it onto col1 with Series.map, setting 1 where there is no match. Then pass the result to Index.repeat on the index and select with DataFrame.loc to duplicate the rows. Finally, add a per-row counter from GroupBy.cumcount to col4:
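# number of copies per col1 value; values not listed are kept once
d = {'B': 3, 'C': 2}
# repeat each index label the mapped number of times and select those rows
df = df_in.loc[df_in.index.repeat(df_in['col1'].map(d).fillna(1).astype(int))]
# add a 0, 1, 2, ... counter within each original row to the existing col4 value
df['col4'] += df.groupby(level=0).cumcount()
df = df.reset_index(drop=True)
print(df)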
  col1 col2  col3  col4  col5
0    A    X     5     4     1
1    B    Y     3     3     1
2    B    Y     3     4     1
3    B    Y     3     5     1
4    C    Y     4     7     4
5    C    Y     4     8     4
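
For reuse, the same steps can be wrapped in a small helper. This is only a sketch: the function name repeat_rows and its parameters (key_col, count_col, repeats) are made up for illustration and are not part of pandas. It assumes count_col holds integers and that any key missing from repeats should be kept exactly once.

```python
import pandas as pd

df_in = pd.DataFrame(
    [["A", "X", 5, 4, 1], ["B", "Y", 3, 3, 1], ["C", "Y", 4, 7, 4]],
    columns=["col1", "col2", "col3", "col4", "col5"],
)


def repeat_rows(df, key_col, count_col, repeats):
    """Hypothetical helper: repeat rows by key and count up in count_col."""
    # Look up how many copies each row needs; unmatched keys get 1 copy.
    n = df[key_col].map(repeats).fillna(1).astype(int)
    # Repeating the index labels and selecting with .loc duplicates the rows.
    out = df.loc[df.index.repeat(n)].copy()
    # Copies of one source row share an index label, so grouping on the
    # index gives a 0, 1, 2, ... counter per source row; add it to the
    # starting value already stored in count_col.
    out[count_col] += out.groupby(level=0).cumcount()
    return out.reset_index(drop=True)


df_out = repeat_rows(df_in, "col1", "col4", {"B": 3, "C": 2})
print(df_out)
```

The .copy() call keeps df_in untouched, and the cast to int avoids passing float repeat counts after fillna.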