Checking for specific value change between columns in pandas

Time:05-11

I've got 4 columns with numeric values between 1 and 4, and I'm trying to see which rows change from a value of 1 to a value of 4 progressing from column a to column d within those 4 columns. Currently I'm pulling the difference between each of the columns and looking for a value of 3. Is there a better way to do this?

Here's what I'm looking for (with 0's in place of nan):

    ID  a  b  c  d  check
    1   1  0  1  4  True
    2   1  0  1  1  False
    3   1  1  1  4  True
    4   1  3  3  4  True
    5   0  0  1  4  True
    6   1  2  3  3  False
    7   1  0  0  4  True
    8   1  4  4  4  True
    9   1  4  3  4  True 
   10   1  4  1  1  True
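
For reproducibility, here is a minimal sketch that rebuilds the sample table and computes the check column by testing every (earlier, later) column pair explicitly. It is one possible reading of the "difference of 3" approach described above, not necessarily the asker's code. The names `df`, `cols` and `pairwise` are assumptions chosen for illustration, and 0 stands in for NaN as in the table.

```python
from itertools import combinations

import pandas as pd

# Sample data from the table above (0 stands in for NaN).
df = pd.DataFrame({
    "ID": range(1, 11),
    "a": [1, 1, 1, 1, 0, 1, 1, 1, 1, 1],
    "b": [0, 0, 1, 3, 0, 2, 0, 4, 4, 4],
    "c": [1, 1, 1, 3, 1, 3, 0, 4, 3, 1],
    "d": [4, 1, 4, 4, 4, 3, 4, 4, 4, 1],
})
cols = ["a", "b", "c", "d"]

# True when some earlier column holds 1 and some later column holds 4.
# combinations() keeps the original column order, so `earlier` is always
# to the left of `later`.
pairwise = pd.Series(False, index=df.index)
for earlier, later in combinations(cols, 2):
    pairwise |= df[earlier].eq(1) & df[later].eq(4)

df["check"] = pairwise
print(df)
```

Testing for the exact pair (1, 4) rather than a difference of 3 avoids flagging a change from 0 to 3 when 0 is used as the NaN placeholder.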

CodePudding user response:

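You can do this with cummax along the rows. If the running maximum equals 1 somewhere in the first three columns and reaches 4 by the last column, then a 1 was followed by a 4: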

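col = ['a', 'b', 'c', 'd']
# Running maximum across each row, left to right
s = df[col].cummax(axis=1)
# axis must be passed by keyword to any() in pandas 2.x
df['new'] = s[col[:3]].eq(1).any(axis=1) & s[col[-1]].eq(4)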
Out[523]: 
0     True
1    False
2     True
3     True
4     True
5    False
6     True
7     True
8     True
dtype: bool
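
To see why the running maximum works, and where it stops working, here is a small sketch on three hand-picked rows. The rows are made up for illustration and are not taken from the question, and the names `running_max` and `result` are assumptions.

```python
import pandas as pd

# Three illustrative rows (hypothetical data, not from the question).
df = pd.DataFrame(
    [[1, 0, 1, 4], [1, 4, 1, 1], [2, 1, 4, 4]],
    columns=["a", "b", "c", "d"],
)

# Running maximum across each row, left to right.
running_max = df.cummax(axis=1)
print(running_max)

# A running max of 1 in a..c means a 1 appeared with nothing larger before it;
# a running max of 4 in d means a 4 appeared somewhere in the row.
result = running_max[["a", "b", "c"]].eq(1).any(axis=1) & running_max["d"].eq(4)
print(result)
```

The third row contains a 1 followed by a 4 but is not flagged, because the leading 2 keeps the running maximum from ever equaling 1. With the question's data this does not matter, but it is worth knowing if a row can start with a value larger than 1.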

CodePudding user response:

You can compare the position of the first 4 with the position of the first 1 in each row using apply:

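cols = ['a', 'b', 'c', 'd']

def get_index(lst, num):
    return lst.index(num) if num in lst else -1

def has_1_before_4(row):
    values = row.tolist()
    first_1 = get_index(values, 1)
    # Require a 1; otherwise -1 would count as "before" any 4
    return first_1 != -1 and get_index(values, 4) > first_1

df['Check'] = df[cols].apply(has_1_before_4, axis=1)
print(df)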

   ID  a  b  c  d  check  Check
0   1  1  0  1  4   True   True
1   2  1  0  1  1  False  False
2   3  1  1  1  4   True   True
3   4  1  3  3  4   True   True
4   5  0  0  1  4   True   True
5   6  1  2  3  3  False  False
6   7  1  0  0  4   True   True
7   8  1  4  4  4   True   True
8   9  1  4  3  4   True   True
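
If row-wise apply is too slow on a large frame, the same position comparison can be sketched with NumPy. This variant assumes the intended rule is "the first 1 sits to the left of the last 4", which is an interpretation of the question rather than something it states. The sample rows and the names `first_one` and `last_four` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative rows (hypothetical data), including rows with no 1 or no 4.
df = pd.DataFrame(
    [[1, 0, 1, 4], [1, 0, 1, 1], [0, 0, 2, 4], [2, 1, 4, 4], [4, 1, 1, 4]],
    columns=["a", "b", "c", "d"],
)
values = df[["a", "b", "c", "d"]].to_numpy()
n_cols = values.shape[1]

is_one = values == 1
is_four = values == 4

# argmax on a boolean array returns the position of the first True
# (and 0 when there is none, hence the any() guards below).
first_one = is_one.argmax(axis=1)
# Reverse the columns to find the last 4, then map back to the original position.
last_four = n_cols - 1 - is_four[:, ::-1].argmax(axis=1)

df["Check"] = is_one.any(axis=1) & is_four.any(axis=1) & (first_one < last_four)
print(df)
```

Rows that lack a 1 or lack a 4 come out False, and a row such as 4, 1, 1, 4 comes out True because a 4 still follows the first 1.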