Pandas: get all positive delta values efficiently

Time:10-21

I am looking for an efficient way to turn this pandas dataframe:

   A  B  C
0  0  1  0
1  0  1  1
2  1  1  1
3  1  1  0
4  0  0  1

into

   A  B  C
0  0  1  0
1  0  0  1
2  1  0  0
3  0  0  0
4  0  0  1

I only want a "1" in a cell if the value in the original dataframe jumps from "0" to "1" at that row. In the first row, I want a "1" if the start value is "1". I have to run this operation often in my project and on a large dataframe, so it should be as efficient as possible. Thanks in advance!

CodePudding user response:

You can use:

df.diff().clip(lower=0).fillna(df).astype(int)

output:

   A  B  C
0  0  1  0
1  0  0  1
2  1  0  0
3  0  0  0
4  0  0  1
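
To make the chain easier to follow, here is a minimal step-by-step sketch on the frame from the question. The intermediate names (`delta`, `rises`, `result`) are only illustrative, and it assumes the frame holds nothing but 0 and 1. The final `astype(int)` casts back to integers, since `diff()` produces floats.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({"A": [0, 0, 1, 1, 0], "B": [1, 1, 1, 1, 0], "C": [0, 1, 1, 0, 1]})

delta = df.diff()            # row-to-row change; the first row is all NaN
rises = delta.clip(lower=0)  # a drop (-1) becomes 0, a rise (+1) is kept, NaN stays NaN
result = rises.fillna(df)    # first row: fall back to the original start values
result = result.astype(int)  # diff() yields floats, so cast back to integers

print(result)
```

On values other than 0 and 1 this would return the size of the increase rather than a flag, so it is best kept to binary data.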

CodePudding user response:

This code snippet should do exactly what you need:

import pandas as pd
df = pd.DataFrame({'A':[0,0,1,1,0],'B':[1,1,1,1,0],'C':[0,1,1,1,1]})
df.loc[-1] = len(df.columns)*[0]
df.index = df.index + 1
df.sort_index(inplace=True)
df = (df.diff() == 1)
df = df.astype(int)
df = df.iloc[1:]
print(df)

Output:

   A  B  C
1  0  1  0
2  0  0  1
3  1  0  0
4  0  0  0
5  0  0  0

I am not sure, however, if this is efficient enough for you.
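
For comparison, the same zero-row idea can be sketched on the frame from the question with `pd.concat`, which leaves the original index untouched. The names `zero_row` and `padded` are illustrative, and this is only one possible way to write it.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({"A": [0, 0, 1, 1, 0], "B": [1, 1, 1, 1, 0], "C": [0, 1, 1, 0, 1]})

# Prepend a row of zeros so the first real row has a predecessor to compare against.
zero_row = pd.DataFrame(0, index=[-1], columns=df.columns)
padded = pd.concat([zero_row, df])

# A difference of exactly 1 marks a 0 -> 1 jump; drop the helper row afterwards.
result = padded.diff().eq(1).iloc[1:].astype(int)

print(result)
```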

CodePudding user response:

Simple and efficient... Shift the dataframe and check for the change from 0 -> 1

m1 = df == 1
m2 = df.shift(fill_value=0) == 0

(m1 & m2) * 1

   A  B  C
0  0  1  0
1  0  0  1
2  1  0  0
3  0  0  0
4  0  0  1
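
As a self-contained sketch of the shift-based check, the two masks can be wrapped in a small helper. The function name `rising_edges` is hypothetical, and the sample frame is the one from the question.

```python
import pandas as pd


def rising_edges(df: pd.DataFrame) -> pd.DataFrame:
    """Return 1 where a cell is 1 and the cell above it is 0, else 0."""
    current_is_one = df.eq(1)
    # fill_value=0 treats the row before the first one as all zeros,
    # so a starting 1 counts as a jump.
    previous_is_zero = df.shift(fill_value=0).eq(0)
    return (current_is_one & previous_is_zero).astype(int)


df = pd.DataFrame({"A": [0, 0, 1, 1, 0], "B": [1, 1, 1, 1, 0], "C": [0, 1, 1, 0, 1]})
print(rising_edges(df))
```

Because `shift(fill_value=0)` introduces no NaN, the frame stays integer-typed throughout and the first row needs no special handling.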

CodePudding user response:

Another possible solution:

df1 = df.shift(fill_value=0) 
1 * (df1.ne(df) & df1.ne(1))

Output:

   A  B  C
0  0  1  0
1  0  0  1
2  1  0  0
3  0  0  0
4  0  0  1
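
Since the question is about efficiency, one way to compare the approaches on your own data is a small `timeit` harness like the sketch below. The frame size, the function names, and the plain NumPy variant `via_numpy` are assumptions added for illustration, not part of the answers above. Timings depend on the machine and the pandas version, so none are quoted here.

```python
import timeit

import numpy as np
import pandas as pd

# Random 0/1 frame as a stand-in for a large real dataset (size is an assumption).
rng = np.random.default_rng(0)
big = pd.DataFrame(rng.integers(0, 2, size=(100_000, 3)), columns=list("ABC"))


def via_diff(df):
    return df.diff().clip(lower=0).fillna(df).astype(int)


def via_shift(df):
    return (df.eq(1) & df.shift(fill_value=0).eq(0)).astype(int)


def via_numpy(df):
    # Same comparison on the raw array: stack a zero row on top of all but the last row.
    values = df.to_numpy()
    previous = np.vstack([np.zeros((1, values.shape[1]), dtype=values.dtype), values[:-1]])
    flags = ((values == 1) & (previous == 0)).astype(int)
    return pd.DataFrame(flags, index=df.index, columns=df.columns)


for fn in (via_diff, via_shift, via_numpy):
    seconds = timeit.timeit(lambda fn=fn: fn(big), number=10)
    print(f"{fn.__name__}: {seconds / 10:.5f} s per call")
```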