Determinate Consecutive Values (Invoices) Pandas-CodePudding

I have a dataset with supplier and its invoices and I need to determinate which of the invoices are consecutives marking it with a 1 or a 0.

For example:

df1 = pd.DataFrame()
df1['Supplier'] = ['ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC']
df1['Invoice'] = ['101','102','105','109','110','114','120','121','122','130','131']

   Supplier Invoice
0       ABC     101
1       ABC     102
2       ABC     105
3       ABC     109
4       ABC     110
5       ABC     114
6       ABC     120
7       ABC     121
8       ABC     122
9       ABC     130
10      ABC     131

And what I want is a third column like this:

   Supplier Invoice Consecutive
0       ABC     101           0
1       ABC     102           1
2       ABC     105           0
3       ABC     109           0
4       ABC     110           1
5       ABC     114           0
6       ABC     120           0
7       ABC     121           1
8       ABC     122           1
9       ABC     130           0
10      ABC     131           1

Thanks in advance!

CodePudding user response：

Make sure your 'Invoice' column is of a numeric type first. Then you can use:

df1['Consecutive'] = (df1.Invoice - df1.Invoice.shift(1) == 1).astype(int)

prints:

   Supplier  Invoice  Consecutive
0       ABC      101            0
1       ABC      102            1
2       ABC      105            0
3       ABC      109            0
4       ABC      110            1
5       ABC      114            0
6       ABC      120            0
7       ABC      121            1
8       ABC      122            1
9       ABC      130            0
10      ABC      131            1

CodePudding user response：

using npwhere and diff

# take difference b/w consecutive rows using diff
# update to 0 when difference is not 1

df1['consecutive']=np.where(df1['Invoice'].astype(int).diff() !=1, 0, 1)
df1

Supplier    Invoice     consecutive
0   ABC     101     0
1   ABC     102     1
2   ABC     105     0
3   ABC     109     0
4   ABC     110     1
5   ABC     114     0
6   ABC     120     0
7   ABC     121     1
8   ABC     122     1
9   ABC     130     0
10  ABC     131     1

CodePudding user response：

df1['Consecutive']=df1.Invoice.astype(int).diff().eq(1).astype(int)

   Supplier Invoice  Consecutive
0       ABC     101            0
1       ABC     102            1
2       ABC     105            0
3       ABC     109            0
4       ABC     110            1
5       ABC     114            0
6       ABC     120            0
7       ABC     121            1
8       ABC     122            1
9       ABC     130            0
10      ABC     131            1