I am trying to replace every non-empty value in each row with that row's first nonzero value. Empty cells (length 0) should be replaced with the float 0.0.
This is the input (the D column is empty in the second and fourth rows):
VOL1  VOL2  D
0     1     3
21    21
19    0     0
18    0
This is the expected output:
VOL1  VOL2  D
1     1     1
21    21    0.0
19    19    19
18    18    0.0
Thus far, this is what I have attempted:
import pandas as pd
import numpy as np
data = {
    'VOL1': [0, 21, 19, 18],
    'VOL2': [1, 21, 0, 0],
}
# Create DataFrame
df = pd.DataFrame(data)
df['D'] = [3, "", 0, ""]
# Get the first nonzero value in each row
first_nonzero_df = df[df != 0].cumsum(axis=1).min(axis=1)
if df.isnull().any(axis=1):
    df.any(axis=1).replace(df, first_nonzero_df)
It's unclear to me what I'm doing wrong here. Any help is appreciated. Thanks!
CodePudding user response:
IIUC, try:
>>> df.where(df != 0, df[df != 0].ffill(axis=1).bfill(axis=1)).replace("", 0)
   VOL1  VOL2     D
0     1     1   3.0
1    21    21   0.0
2    19    19  19.0
3    18    18   0.0
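Note that the one-liner above only fills the zeros, so D stays 3 in the first row, whereas the expected output in the question shows 1 there. If every non-empty cell should take the row's first nonzero value, something along these lines might work. This is only a sketch: the names `numeric`, `first_nonzero` and `result` are made up for illustration, and it assumes empty cells are empty strings, as in the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "VOL1": [0, 21, 19, 18],
    "VOL2": [1, 21, 0, 0],
    "D": [3, "", 0, ""],
})

# Turn empty strings into NaN so that every column is numeric.
numeric = df.apply(pd.to_numeric, errors="coerce")

# Mask the zeros, back-fill along each row, and take the first column:
# that leaves the first nonzero value of every row (NaN if there is none).
first_nonzero = numeric.where(numeric != 0).bfill(axis=1).iloc[:, 0]

# Non-empty cells get the row's first nonzero value, empty cells get 0.0.
result = pd.DataFrame(
    np.where(numeric.notna().to_numpy(), first_nonzero.to_numpy()[:, None], 0.0),
    index=df.index,
    columns=df.columns,
)
print(result)
```

With the sample data, the rows should come out as 1 1 1, 21 21 0, 19 19 19 and 18 18 0, all as floats. A row with no nonzero value at all would end up as NaN in its non-empty cells, so you may want a `fillna` afterwards if that can happen.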
CodePudding user response:
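import pandas as pd

data = {
    'VOL1': [0, 21, 19, 18],
    'VOL2': [1, 21, 0, 0],
}
# Create DataFrame
df = pd.DataFrame(data)
df['D'] = [None] * len(df)
# Zeros are masked to NaN; for positive values, the minimum of the
# running row sum is then the first nonzero value in the row
first_nonzero_df = df[df != 0].cumsum(axis=1).min(axis=1)
keys = df.keys()
for i in range(len(df)):
    for j in range(len(keys)):
        # Use .loc rather than chained indexing so the assignment hits df itself
        if df.loc[i, keys[j]] == 0:
            df.loc[i, keys[j]] = first_nonzero_df[i]
df = df.fillna(0)
df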
Output: