Pandas Dataframe: For loop that adds a fixed integer if the value exists in previous rows

Time:07-27

For the following dataframe:

import pandas as pd
df = pd.DataFrame({'Rounds':[1000,1000,1000,1000,3000,3000,4000,5000,6000,6000]})

I would like a for loop that adds a fixed integer (25 in this case) to a value each time that value has already appeared in a previous row, producing:

df = pd.DataFrame({'Rounds':[1000,1025,1050,1075,3000,3025,4000,5000,6000,6025]})

Initially I tried

for i in df.index:
    if df.iat[i,1] == df.iloc[i-1,1]:
        df.iat[i,1] = df.iat[i-1,1] + 25

The problem is that it doesn't work when a value is repeated more than twice, and I would like to refer to the column by its name, "Rounds", instead of by its position.

CodePudding user response:

You need groupby.cumcount:

df['Rounds'] += df.groupby('Rounds').cumcount()*25

output:

   Rounds
0    1000
1    1025
2    1050
3    1075
4    3000
5    3025
6    4000
7    5000
8    6000
9    6025

intermediate:

df.groupby('Rounds').cumcount()

0    0
1    1
2    2
3    3
4    0
5    1
6    0
7    0
8    0
9    1
dtype: int64
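
Note that cumcount numbers every earlier occurrence of a value, not only occurrences in directly adjacent rows. The small sketch below uses a made-up dataframe with interleaved duplicates to illustrate this; the `offsets` name is only for illustration.

```python
import pandas as pd

# Hypothetical data: the duplicates are not next to each other
df = pd.DataFrame({'Rounds': [1000, 3000, 1000, 3000, 1000]})

# cumcount gives 0 for the first occurrence of a value, 1 for the second, ...
offsets = df.groupby('Rounds').cumcount() * 25
print(offsets.tolist())       # [0, 0, 25, 25, 50]

# Add the offsets back onto the original column
df['Rounds'] += offsets
print(df['Rounds'].tolist())  # [1000, 3000, 1025, 3025, 1050]
```

If only runs of consecutive equal values should be incremented, a different grouping key would be needed.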

CodePudding user response:

Use groupby cumcount:

df["Rounds"] += df.groupby(df["Rounds"]).cumcount() * 25
print(df)

Output

   Rounds
0    1000
1    1025
2    1050
3    1075
4    3000
5    3025
6    4000
7    5000
8    6000
9    6025
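
If an explicit loop is still preferred, as the question asks, one possible sketch is to keep a running count of how often each value has been seen so far. This is not taken from either answer; the `STEP`, `seen`, and `adjusted` names are assumptions for illustration. It refers to the column by name and handles any number of repeats, because each row is compared against the counts rather than against the already modified previous row.

```python
import pandas as pd

df = pd.DataFrame({'Rounds': [1000, 1000, 1000, 1000, 3000, 3000, 4000, 5000, 6000, 6000]})

STEP = 25      # fixed integer added per earlier occurrence
seen = {}      # value -> number of times it has appeared so far
adjusted = []  # new column values, built row by row

for value in df['Rounds']:
    count = seen.get(value, 0)
    adjusted.append(value + count * STEP)
    seen[value] = count + 1

df['Rounds'] = adjusted
print(df['Rounds'].tolist())
# [1000, 1025, 1050, 1075, 3000, 3025, 4000, 5000, 6000, 6025]
```

For large dataframes the groupby version is likely to be faster, since it avoids a Python-level loop.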