How to method-chain `ffill(axis=1)` in a dataframe

Time:11-19

I would like to fill column `b` of a dataframe with the value from column `a` wherever `b` is NaN, and I would like to do it inside a method chain, but I cannot figure out how.

The following works:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [10, np.nan, np.nan, 40], "c": ["a", "b", "c", "d"]}
)
df["b"] = df[["a", "b"]].ffill(axis=1)["b"]
print(df.to_markdown())

|    |   a |   b | c   |
|---:|----:|----:|:----|
|  0 |   1 |  10 | a   |
|  1 |   2 |   2 | b   |
|  2 |   3 |   3 | c   |
|  3 |   4 |  40 | d   |

This gives the desired result, but it is not method-chained. Thanks a lot for the help!

CodePudding user response:

# Fill the NaNs in b with the value of a from the same row.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, np.nan, np.nan, 40], "c": ["a", "b", "c", "d"]})
df["b"] = df["b"].fillna(df["a"])

|    |   a |   b | c   |
|---:|----:|----:|:----|
|  0 |   1 |  10 | a   |
|  1 |   2 |   2 | b   |
|  2 |   3 |   3 | c   |
|  3 |   4 |  40 | d   |
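
The assignment above is still a separate statement. As a minimal sketch, assuming plain pandas is acceptable, the same `fillna` call can be moved into a chain with `DataFrame.assign` and a lambda, so that the lambda receives the dataframe as it exists at that point in the chain. The name `result` is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [10, np.nan, np.nan, 40], "c": ["a", "b", "c", "d"]}
)

result = (
    df
    # assign returns a new dataframe; the lambda sees the current frame as d
    .assign(b=lambda d: d["b"].fillna(d["a"]))
)
print(result)
```

Because `assign` returns a copy, the original `df` keeps its NaN values, and further steps can be appended to the chain after it.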

CodePudding user response:

One solution I have found uses the pyjanitor library:

import numpy as np
import pandas as pd
import janitor  # the pyjanitor package is imported as `janitor`; importing it registers `case_when`

df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [10, np.nan, np.nan, 40], "c": ["a", "b", "c", "d"]}
)
df.case_when(
    lambda x: x["b"].isna(), lambda x: x["a"], lambda x: x["b"], column_name="b"
)

Here, `case_when(...)` can be integrated into a chain of manipulations, and the whole dataframe stays in the chain.

I wonder how this could be accomplished without pyjanitor.
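
One possible way to do this without pyjanitor, sketched here under the assumption that `DataFrame.assign` is acceptable in the chain: wrap either the original `ffill(axis=1)` expression or a `Series.mask` call (a rough stand-in for the single-condition `case_when`) in a lambda. The names `via_ffill` and `via_mask` are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [10, np.nan, np.nan, 40], "c": ["a", "b", "c", "d"]}
)

# Variant 1: the original row-wise forward fill, moved into assign.
via_ffill = df.assign(b=lambda d: d[["a", "b"]].ffill(axis=1)["b"])

# Variant 2: where b is NaN, take a; otherwise keep b.
via_mask = df.assign(b=lambda d: d["b"].mask(d["b"].isna(), d["a"]))

print(via_ffill)
print(via_mask)
```

Both variants keep the whole dataframe in the chain, so other methods can be called before or after the `assign` step.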
