Pandas: add a column with the value of the next row in another column (per group)

Time:11-08

I have the dataframe:

df = batch Code
      a     100
      a     120
      a     130
      a     120 
      b     140
      b     150
      c     100

I want to add a column 'next_code' that holds the value of the 'Code' column from the next row within the same batch, with 'END' for the last row of each batch. So the output will be:

df = batch Code next_code
      a     100    120
      a     120    130
      a     130    120
      a     120    END
      b     140    150
      b     150    END
      c     100    END

What is the best way to do it?

CodePudding user response:

Use DataFrameGroupBy.shift with the fill_value parameter, so the last row of each batch gets 'END' directly:

df['next_code'] = df.groupby('batch')['Code'].shift(-1, fill_value='END')
print(df)
  batch  Code next_code
0     a   100       120
1     a   120       130
2     a   130       120
3     a   120       END
4     b   140       150
5     b   150       END
6     c   100       END

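Or, for older pandas versions where shift has no fill_value parameter, use Series.fillna. The shifted codes print as floats because shift introduces NaN before the fill: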

df['next_code'] = df.groupby('batch')['Code'].shift(-1).fillna('END')
print(df)
  batch  Code next_code
0     a   100     120.0
1     a   120     130.0
2     a   130     120.0
3     a   120       END
4     b   140     150.0
5     b   150       END
6     c   100       END
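
The fillna variant prints 120.0 rather than 120 because the shifted column is float before 'END' is filled in. A minimal sketch of one way to keep integer values next to the 'END' marker, assuming a pandas version that supports the nullable Int64 dtype; the intermediate name shifted is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "batch": ["a", "a", "a", "a", "b", "b", "c"],
    "Code": [100, 120, 130, 120, 140, 150, 100],
})

# Shift within each batch; the last row of every batch becomes NaN (float column).
shifted = df.groupby("batch")["Code"].shift(-1)

# Int64 turns 120.0 back into 120 and NaN into <NA>; casting to object
# afterwards lets the string 'END' sit next to the integers.
df["next_code"] = shifted.astype("Int64").astype(object).fillna("END")

print(df)
```

The resulting column has object dtype, so it is suited to display or export rather than further arithmetic.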

CodePudding user response:

You can also solve this with groupby and the apply method, shifting the values within each group:

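import pandas as pd

def next_code(x):
    # Shift the group's values up one row; the last row has no successor, so fill it with 'END'
    return x.shift(-1).fillna('END')

df = pd.DataFrame({
    "batch": ['a', 'a', 'a', 'a', 'b', 'b', 'c'],
    "code": [100, 120, 130, 120, 140, 150, 100]
})
# group_keys=False keeps the original row index, so the result aligns with df
# (on pandas 2.0 and later the default would prepend the group key to the index)
df['next_code'] = df.groupby('batch', group_keys=False)['code'].apply(next_code)
print(df)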

Output:

  batch  code next_code
0     a   100     120.0
1     a   120     130.0
2     a   130     120.0
3     a   120       END
4     b   140     150.0
5     b   150       END
6     c   100       END
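
As a hedged alternative sketch, transform returns a result indexed like the original rows, so no index handling is needed, and it should give the same column as the built-in grouped shift. The names via_shift and via_transform are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({
    "batch": ["a", "a", "a", "a", "b", "b", "c"],
    "code": [100, 120, 130, 120, 140, 150, 100],
})

# Built-in grouped shift: the last row of each batch becomes NaN, then 'END'.
via_shift = df.groupby("batch")["code"].shift(-1).fillna("END")

# Same idea through transform, which returns a result aligned to df's index.
via_transform = (
    df.groupby("batch")["code"]
    .transform(lambda s: s.shift(-1))
    .fillna("END")
)

# Check that both routes agree before assigning the column.
print(via_shift.tolist() == via_transform.tolist())
df["next_code"] = via_transform
```

The built-in shift is usually the better choice on larger frames, since it avoids calling a Python function once per group.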