pandas: add a group number using a loop

Time:10-04

Given the following DataFrame:

df=pd.DataFrame({'Time_diff':[1400,1200,1000,1800,1200,1200,1200,1800,1200]})

There is a column "Time_diff". I am trying to add a group number in a new column "group_num", and the group number should increase by one each time a row meets the condition Time_diff >= 1800, as shown below:

Time_diff group_num
1400 1
1200 1
1000 1
1800 2
1200 2
1200 2
1200 2
1800 3
1200 3

I wrote a loop, but it doesn't work. What should I do?

a=1
for i in range(1,len(df)):
    if df[i]['time_diff'] < 1800:
        df[i]['group_num']=a
    else:
        a += 1
        df[i]['group_num']=a

CodePudding user response:

Test the condition Time_diff >= 1800 with .ge(), then use .cumsum() to increment the count each time the condition is met further down the series:

df['group_num'] = df['Time_diff'].ge(1800).cumsum() + 1

Result:

print(df)

   Time_diff  group_num
0       1400          1
1       1200          1
2       1000          1
3       1800          2
4       1200          2
5       1200          2
6       1200          2
7       1800          3
8       1200          3
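
To see why this works, the one-liner can be unpacked into its intermediate steps. The following is only an illustrative sketch; the names starts_group and group_offset are made up here and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'Time_diff': [1400, 1200, 1000, 1800, 1200, 1200, 1200, 1800, 1200]})

# Step 1: boolean mask, True on every row that should start a new group
starts_group = df['Time_diff'].ge(1800)

# Step 2: running count of the True values seen so far (True counts as 1)
group_offset = starts_group.cumsum()

# Step 3: add 1 so the numbering starts at 1 instead of 0
df['group_num'] = group_offset + 1

# Show the intermediate columns next to the final result
print(df.assign(starts_group=starts_group, group_offset=group_offset))
```

Each True in the mask bumps the running total by one, so every row from a threshold row onward gets the next group number until the following threshold row.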

CodePudding user response:

I have modified your code to produce the expected result using a for loop:


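import pandas as pd

df = pd.DataFrame({'Time_diff': [1400, 1200, 1000, 1800, 1200, 1200, 1200, 1800, 1200]})

a = 1           # current group number
group_num = []  # collects one group number per row
for i, row in df.iterrows():
    if row['Time_diff'] < 1800:
        # below the threshold: stay in the current group
        group_num.append(a)
    else:
        # 1800 or more: start a new group
        a += 1
        group_num.append(a)
df['group_num'] = group_num
print(df.to_string(index=False))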

Output:

Time_diff  group_num
      1400          1
      1200          1
      1000          1
      1800          2
      1200          2
      1200          2
      1200          2
      1800          3
      1200          3
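
If a loop is preferred, it may be enough to iterate over the column values directly instead of calling iterrows(), which builds a Series for every row. This is a sketch under that assumption; THRESHOLD and current are illustrative names, not taken from the answer above.

```python
import pandas as pd

df = pd.DataFrame({'Time_diff': [1400, 1200, 1000, 1800, 1200, 1200, 1200, 1800, 1200]})

THRESHOLD = 1800  # illustrative name for the cutoff used in the question

group_num = []
current = 1  # group number assigned to the rows before the first threshold row
for value in df['Time_diff']:
    if value >= THRESHOLD:
        current += 1  # a value at or above the cutoff opens a new group
    group_num.append(current)

df['group_num'] = group_num

# Sanity check: the loop should agree with the vectorized .ge().cumsum() form
vectorized = df['Time_diff'].ge(THRESHOLD).cumsum() + 1
print((df['group_num'] == vectorized).all())
```

On a small frame like this one the difference is negligible; the vectorized form is usually the better choice for large data.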