Pandas: get all positive delta values efficiently

I am looking for an efficient way to turn this pandas dataframe:

into

I only want a "1" in a cell where the value in the original dataframe jumps from "0" to "1". In the first row, I want a "1" if the start value is "1". I have to use this operation often in my project and on a large dataframe, so it should be as efficient as possible. Thanks in advance!

CodePudding user response：

You can use:

df.diff().clip(0).fillna(df)

output:
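As a sketch of what each step in that one-liner contributes, here it is split into stages on a small sample frame. The sample values and the names step1, step2, step3 and result are assumptions for illustration, since the frame from the question is not reproduced here.

```python
import pandas as pd

# Sample 0/1 frame (an assumption; the question's frame is not shown).
df = pd.DataFrame({"A": [0, 0, 1, 1, 0],
                   "B": [1, 1, 1, 1, 0],
                   "C": [0, 1, 1, 1, 1]})

step1 = df.diff()         # row-to-row change; the first row becomes NaN
step2 = step1.clip(0)     # lower bound 0: drops (-1) are turned into 0
step3 = step2.fillna(df)  # the NaN first row falls back to the original values

# diff() produces floats, so cast back to integers at the end.
result = step3.astype(int)
print(result)
#    A  B  C
# 0  0  1  0
# 1  0  0  1
# 2  1  0  0
# 3  0  0  0
# 4  0  0  0
```

Note that without the final astype(int) the result stays in float dtype (0.0 and 1.0), which may or may not matter for later steps.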

CodePudding user response：

This code snippet should do exactly what you need:

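import pandas as pd

df = pd.DataFrame({'A':[0,0,1,1,0],'B':[1,1,1,1,0],'C':[0,1,1,1,1]})
df.loc[-1] = len(df.columns)*[0]  # add a row of zeros so a leading 1 counts as a jump
df.index = df.index + 1           # shift the labels so the new row gets label 0
df.sort_index(inplace=True)       # move the zero row to the top
df = (df.diff() == 1)             # True only where the value rose by 1
df = df.astype(int)
df = df.iloc[1:]                  # drop the helper row again
print(df)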

Output:

I am not sure, however, if this is efficient enough for you.
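Whether it is fast enough is probably easiest to settle by measuring on data of a realistic size. Below is a rough timing sketch, not a rigorous benchmark. The frame size, the random data and the function names via_padding and via_diff are all assumptions; via_padding follows the same idea as the snippet above but uses pd.concat so the input frame is not modified.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical benchmark data: random 0/1 values, size chosen arbitrarily.
rng = np.random.default_rng(0)
big = pd.DataFrame(rng.integers(0, 2, size=(100_000, 5)), columns=list("ABCDE"))

def via_padding(df):
    # Prepend a row of zeros, diff, keep only +1 steps, drop the helper row.
    zero_row = pd.DataFrame(0, index=[-1], columns=df.columns)
    padded = pd.concat([zero_row, df])
    return (padded.diff() == 1).astype(int).iloc[1:]

def via_diff(df):
    # The diff/clip/fillna one-liner from the earlier answer.
    return df.diff().clip(0).fillna(df).astype(int)

for fn in (via_padding, via_diff):
    seconds = timeit.timeit(lambda: fn(big), number=5)
    print(f"{fn.__name__}: {seconds / 5:.4f} s per call")
```

The numbers depend on the machine, the pandas version and the shape of the frame, so it is worth rerunning this with data that resembles the real project.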

CodePudding user response：

Simple and efficient... Shift the dataframe and check for the change from 0 -> 1

m1 = df == 1
m2 = df.shift(fill_value=0) == 0

(m1 & m2) * 1
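For repeated use, the two masks can be wrapped in a small function. This is a minimal sketch assuming the frame only contains 0 and 1; the name rising_edges and the sample frame are hypothetical.

```python
import pandas as pd

def rising_edges(df):
    """Return 1 where a cell is 1 and the cell above it is 0, else 0."""
    is_one = df == 1
    # shift() moves every row down by one; fill_value=0 puts a virtual 0
    # above the first row, so a leading 1 counts as a jump.
    prev_is_zero = df.shift(fill_value=0) == 0
    return (is_one & prev_is_zero).astype(int)

# Sample 0/1 frame (an assumption, for illustration only).
df = pd.DataFrame({"A": [0, 0, 1, 1, 0],
                   "B": [1, 1, 1, 1, 0],
                   "C": [0, 1, 1, 1, 1]})

edges = rising_edges(df)
print(edges)
```

Here astype(int) plays the same role as multiplying by 1: both turn the boolean mask into integers.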

CodePudding user response：

Another possible solution:

df1 = df.shift(fill_value=0) 
1 * (df1.ne(df) & df1.ne(1))

Output:
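It may not be obvious why "changed and previous value is not 1" is the same as "previous is 0 and current is 1". For 0/1 data this can be checked by listing all four combinations. The sketch below does that; the column names are made up for illustration.

```python
import pandas as pd

# All four (previous, current) combinations for 0/1 data.
cases = pd.DataFrame({"prev": [0, 0, 1, 1], "curr": [0, 1, 0, 1]})

# Condition from the snippet above: the value changed AND the previous value was not 1.
cases["changed_and_prev_not_1"] = (
    cases["prev"].ne(cases["curr"]) & cases["prev"].ne(1)
).astype(int)

# Direct statement of the requirement: previous is 0 and current is 1.
cases["prev_0_and_curr_1"] = (
    cases["prev"].eq(0) & cases["curr"].eq(1)
).astype(int)

print(cases)
```

Both derived columns agree on every row, and only the (0, 1) row is flagged. This equivalence relies on the data being strictly 0/1: with other values (say previous 0, current 2) the two conditions would give different answers.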