Pandas shift logic

Time:01-10

I have a dataframe like:

col1 customer
 1    a
 3    a
 1    b
 2    b
 3    b
 5    b

I want the logic to be like this:

col1 customer  col2
 1    a         1
 3    a         1
 1    b         1
 2    b         2
 3    b         3
 5    b         3

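As you can see, as long as a customer's values in col1 are consistent (each value is 1 more than the previous one), col2 should repeat col1. Where the sequence breaks, col2 should hold the last consistent number, which is 3 for customer b.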

I tried using df.shift() but got stuck.

Further Example:

col1
1
1
1
3
5
8
10

This customer should be given a value of 1, because that is the last consistent value for them.

CodePudding user response:

Update

If you have more than one month, you can use this version:

import numpy as np

inc_count = lambda x: np.where(x.diff(1) == 1, x, x.shift(fill_value=x.iloc[0]))
df['col2'] =  df.groupby('customer')['col1'].transform(inc_count)
print(df)

# Output
   col1 customer  col2
0     1        a     1
1     3        a     1
2     1        b     1
3     2        b     2
4     3        b     3
5     5        b     3
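
To see why this works, it may help to look at the intermediate steps for a single customer. The following is only an illustrative breakdown of the lambda above, run on customer b from the question; the variable names `step`, `previous` and `result` are made up for the example.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col1": [1, 3, 1, 2, 3, 5],
    "customer": ["a", "a", "b", "b", "b", "b"],
})

# Work on one customer so the intermediate steps are easy to read.
x = df.loc[df["customer"] == "b", "col1"]

# Difference to the previous row; the first row has no predecessor (NaN).
step = x.diff(1)

# Previous row's value; the first row falls back to its own value.
previous = x.shift(fill_value=x.iloc[0])

# Keep the current value where the step is exactly 1, else take the previous one.
result = np.where(step == 1, x, previous)

print(previous.tolist())
print(result.tolist())
```

Note that this only looks one row back, so after a break the next row picks up the previous row's value rather than freezing for the rest of the group.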

Maybe you want to increment a counter whenever a row's value is exactly 1 more than the value in the previous row:

# Same as df['col1'].diff().eq(1).cumsum().add(1)
df['col2'] = df['col1'].eq(df['col1'].shift() + 1).cumsum().add(1)
print(df)

# Output
   col1 customer  col2
0     1        a     1
1     3        a     1
2     1        b     1
3     2        b     2
4     3        b     3
5     5        b     3
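
One thing to watch with this ungrouped version: shift() compares each row with the row above it even when the customer changes. Here is a small sketch with made-up data (not from the question) where customer b happens to start right after the last value of customer a, so the counter keeps climbing instead of restarting at 1.

```python
import pandas as pd

# Hypothetical data: customer b starts exactly where customer a left off.
demo = pd.DataFrame({
    "col1": [1, 2, 3, 4],
    "customer": ["a", "a", "b", "b"],
})

# Ungrouped counter: the previous row is used regardless of which customer it belongs to.
demo["col2"] = demo["col1"].eq(demo["col1"].shift() + 1).cumsum().add(1)
print(demo)
```

With the data from the question this does not show up, because customer b starts at 1 while customer a ends at 3.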

Or for each customer:

inc_count = lambda x: x.eq(x.shift() + 1).cumsum().add(1)
df['col2'] = df.groupby('customer')['col1'].transform(inc_count)
print(df)

# Output
   col1 customer  col2
0     1        a     1
1     3        a     1
2     1        b     1
3     2        b     2
4     3        b     3
5     5        b     3
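
For the "Further Example" in the question, where every row should get 1, another reading of the requirement is possible: keep col1 while it rises by exactly 1, and once the run breaks, hold the last consistent value for the rest of that customer's rows. The sketch below assumes that reading; `last_consistent` is a hypothetical helper name, not something from pandas.

```python
import pandas as pd

def last_consistent(x: pd.Series) -> pd.Series:
    """Keep values while they rise by exactly 1, then hold the last such value."""
    # The first row has no predecessor (diff is NaN), so treat it as consistent.
    step_ok = x.diff().fillna(1).eq(1)
    # Assumption: once the run breaks, it stays broken for the rest of the group.
    broken = (~step_ok).cumsum().gt(0)
    # Blank out the broken rows and carry the last consistent value forward.
    return x.where(~broken).ffill().astype(x.dtype)

# Sample data copied from the question.
df = pd.DataFrame({
    "col1": [1, 3, 1, 2, 3, 5],
    "customer": ["a", "a", "b", "b", "b", "b"],
})
df["col2"] = df.groupby("customer")["col1"].transform(last_consistent)
print(df)

# The single-customer "Further Example" from the question.
further = last_consistent(pd.Series([1, 1, 1, 3, 5, 8, 10]))
print(further.tolist())
```

This differs from the np.where version when a sequence resumes after a break: for [1, 2, 4, 5] it holds 2 to the end, whereas the np.where version would give 5 in the last row.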