Python Pandas DataFrames compare with next rows

Time:11-07

I have a DataFrame like this:

        col1  
    0     1
    1     3
    2     3
    3     1
    4     2
    5     3
    6     2
    7     2 

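I want to create a column out by comparing each row with the next row. If the value in row 0 is less than the value in row 1, then out is 1. If a row's value is not less than the next row's value (for example row 1 compared with row 2), then out is 0. Like this sample: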

        col1  out  
    0     1   1     # 1<3 = 1
    1     3   0     # 3<3 = 0
    2     3   0     # 3<1 = 0
    3     1   1     # 1<2 = 1
    4     2   1     # 2<3 = 1
    5     3   0     # 3<2 = 0
    6     2   0     # 2<2 = 0
    7     2   - 

I tried this code:

    def comp_out(a):

        return np.concatenate(([1],a[1:] > a[2:]))
    
    df['out'] = comp_out(df.col1.values)

It raises this error:

ValueError: operands could not be broadcast together with shapes (11,) (10,) 

CodePudding user response:

Let's use shift instead to "shift" the column up by one, so that each row is aligned with the value from the next row, then use lt to test "less than" and astype to convert the booleans to 1/0:

df['out'] = df['col1'].lt(df['col1'].shift(-1)).astype(int)

df:

   col1  out
0     1    1
1     3    0
2     3    0
3     1    1
4     2    1
5     3    0
6     2    0
7     2    0
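
To see why this works, it can help to look at the shifted column next to the original. A minimal sketch, rebuilding the sample frame from the question; the nxt variable and the extra next column are only for illustration and are not part of the answer:

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 3, 3, 1, 2, 3, 2, 2]})

# shift(-1) moves every value up one row, so row i holds the value of row i + 1.
# The last row has no successor and becomes NaN.
nxt = df["col1"].shift(-1)

# lt compares element-wise; a comparison against NaN is always False,
# which is why the last row ends up as 0 after astype(int).
out = df["col1"].lt(nxt).astype(int)

print(pd.DataFrame({"col1": df["col1"], "next": nxt, "out": out}))
```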

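The last row is 0 because it is compared with the NaN produced by shift, and that comparison is False. We can strip the last value with iloc if needed; the missing entry is then filled with NaN on assignment, which turns the column into floats: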

df['out'] = df['col1'].lt(df['col1'].shift(-1)).iloc[:-1].astype(int)

df:

   col1  out
0     1  1.0
1     3  0.0
2     3  0.0
3     1  1.0
4     2  1.0
5     3  0.0
6     2  0.0
7     2  NaN
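
If the float column is unwanted, one possible variation (an assumption on my part, not something the question asked for) is pandas' nullable Int64 dtype, which can hold integers together with a missing value. A sketch on the same sample data:

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 3, 3, 1, 2, 3, 2, 2]})

nxt = df["col1"].shift(-1)

# Convert the booleans to the nullable integer dtype "Int64",
# then blank out the rows that have no next value (only the last row here).
out = df["col1"].lt(nxt).astype("Int64").mask(nxt.isna())

df["out"] = out
print(df)
```

The column then keeps whole numbers for the first seven rows and shows <NA> in the last row.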

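If we want to keep the function, both slices must have the same length: compare everything except the last value (a[:-1]) with everything except the first value (a[1:]), then append NaN for the last row, which has nothing to compare against: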

import numpy as np

def comp_out(a):
    # a[:-1] and a[1:] have equal length; NaN pads the last row (result is float)
    return np.concatenate([a[:-1] < a[1:], [np.nan]])


df['out'] = comp_out(df['col1'].to_numpy())

df:

   col1  out
0     1  1.0
1     3  0.0
2     3  0.0
3     1  1.0
4     2  1.0
5     3  0.0
6     2  0.0
7     2  NaN
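
The ValueError in the question comes from the two slices a[1:] and a[2:] starting at different offsets, so one is always one element longer than the other. A small sketch on the eight sample values (the error message in the question shows a longer array, but the mismatch is the same):

```python
import numpy as np

a = np.array([1, 3, 3, 1, 2, 3, 2, 2])

# The question's slices differ in length by one, so they cannot be compared
print(a[1:].shape, a[2:].shape)    # (7,) (6,)

# Dropping the last element on one side and the first on the other
# gives two arrays of len(a) - 1 elements that line up row by row
print(a[:-1].shape, a[1:].shape)   # (7,) (7,)

less_than_next = a[:-1] < a[1:]
print(less_than_next.astype(int))  # [1 0 0 1 1 0 0]
```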