How to take the value from another column in a DataFrame when the default column is NaN?

My DataFrame has three columns. The third column is derived from the first two. The default source is column 2, but if column 2 is NaN, I want column 3 to be filled from column 1. I added a third entry to conditions to handle this, but it does not seem to work.

This is the DataFrame:

df = pd.DataFrame(np.array([[np.nan, 1717], [1749, 1750], [1704, np.nan]]),
                   columns=['a', 'b'])

This is my code:

import numpy as np
import pandas as pd
conditions = [
    (df["b"] <= df["a"]), 
    df["b"] > df["a"],
    df["b"] == df["b"].isna()]

choices = [df["b"], df["a"], df["a"]]

df['c'] = np.select(conditions, choices, default=df["b"])
print(df)

This is my output:

           a            b      c
0        NaN         1749.0  1749.0
1        1717.0      1750.0  1717.0
2        1704.0      NaN     NaN

But I want c to be filled if a or b is filled. So this is the output I want:

           a            b      c
0        NaN         1749.0  1749.0
1        1717.0      1750.0  1717.0
2        1704.0      NaN     1704.0

CodePudding user response：

You only need a small change to your third condition. df["b"].isna() already returns a boolean Series (True where b is NaN), so df["b"] == df["b"].isna() compares each number in b with True or False. For this data that comparison is False in every row, including the NaN row, because NaN never compares equal to anything.
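To see why the original third condition never matches, it can help to print the comparison on a small Series. The sketch below uses a stand-in Series named b with the same values as column b in the question; the names b, mask and broken are only for illustration.

```python
import numpy as np
import pandas as pd

# Stand-in for df["b"] from the question (illustrative name)
b = pd.Series([1717.0, 1750.0, np.nan])

mask = b.isna()       # boolean Series marking the missing entries
broken = b == mask    # compares each number with a boolean, elementwise

print(mask.tolist())    # [False, False, True]
print(broken.tolist())  # [False, False, False]
```

Only mask flags the NaN row, which is what np.select needs as a condition.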

Just remove the first half of the third condition, so that it reads df["b"].isna():

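import numpy as np
import pandas as pd

# np.select uses the first condition that is True for each row.
# Comparisons with NaN are always False, so rows containing a NaN
# fall through the first two conditions.
conditions = [
    df["b"] <= df["a"],   # b is the smaller (or equal) value
    df["b"] > df["a"],    # a is the smaller value
    df["b"].isna()]       # b is missing: fall back to a

choices = [df["b"], df["a"], df["a"]]

# default=df["b"] covers the remaining case, where a is NaN
df['c'] = np.select(conditions, choices, default=df["b"])
print(df)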
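Since the first two conditions together pick the smaller of a and b, and the remaining cases fall back to whichever value is present, the same column can probably also be built with a row-wise minimum. This is an alternative sketch, not the approach used in the answer above; it relies on DataFrame.min skipping NaN by default and rebuilds the DataFrame from the question so that it runs on its own.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[np.nan, 1717], [1749, 1750], [1704, np.nan]]),
                  columns=['a', 'b'])

# Row-wise minimum; skipna=True (the default) ignores NaN,
# so a row with one missing value yields the other value.
df['c'] = df[['a', 'b']].min(axis=1)
print(df)
```

If both a and b are NaN in a row, c stays NaN, which matches what the np.select version would give.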

CodePudding user response：

Another option is to start from column a and fall back to column b wherever a is NaN:

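import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[np.nan, 1717], [1749, 1750], [1704, np.nan]]),
               columns=['a', 'b'])

df['c'] = df.a

for i in range(len(df)):
    # NaN never compares equal to anything (not even np.nan),
    # so test with pd.isna instead of ==
    if pd.isna(df.a.iloc[i]):
        # assign through df.loc to avoid chained assignment
        df.loc[df.index[i], 'c'] = df.b.iloc[i]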
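The loop above can likely be replaced by a single vectorized call. The sketch below uses Series.fillna with another Series, which fills each NaN in a with the value of b at the same index label; it rebuilds the DataFrame from the question so that it runs on its own.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[np.nan, 1717], [1749, 1750], [1704, np.nan]]),
                  columns=['a', 'b'])

# Take a where it is present, otherwise fall back to b (aligned on the index)
df['c'] = df['a'].fillna(df['b'])
print(df)
```

Like the loop, this treats a as the preferred column. To make b the default and a the fallback, as the question describes, swap the two: df['b'].fillna(df['a']).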