I would like to fill column b of a dataframe with values from a in case b is nan, and I would like to do it in a method chain, but I cannot figure out how to do this.
The following works
import numpy as np
import pandas as pd
df = pd.DataFrame(
{"a": [1, 2, 3, 4], "b": [10, np.nan, np.nan, 40], "c": ["a", "b", "c", "d"]}
)
df["b"] = df[["a", "b"]].ffill(axis=1)["b"]
print(df.to_markdown())
| | a | b | c |
|---:|----:|----:|:----|
| 0 | 1 | 10 | a |
| 1 | 2 | 2 | b |
| 2 | 3 | 3 | c |
| 3 | 4 | 40 | d |
but is not method-chained. Thanks a lot for the help!
CodePudding user response:
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, np.nan, np.nan, 40], "c": ["a", "b", "c", "d"]})
df['b'] = df.b.fillna(df.a)
| | a | b | c |
|---:|----:|----:|:----|
| 0 | 1 | 10 | a |
| 1 | 2 | 2 | b |
| 2 | 3 | 3 | c |
| 3 | 4 | 40 | d |
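To keep this inside a method chain, the same fillna call could probably be wrapped in DataFrame.assign with a lambda, so that it refers to the frame as it exists at that point in the chain. A minimal sketch, reusing the example frame from the question (the name result is only for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [10, np.nan, np.nan, 40], "c": ["a", "b", "c", "d"]}
)

# assign returns a new DataFrame instead of modifying df in place,
# so it can sit anywhere in a chain. The lambda receives the frame
# produced by the previous step of the chain.
result = df.assign(b=lambda d: d["b"].fillna(d["a"]))
```

Because assign does not modify df, the original frame still contains the NaN values afterwards.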
CodePudding user response:
One solution I have found is by using the pyjanitor library:
import numpy as np
import pandas as pd
import janitor  # the pyjanitor package is imported under the name "janitor"
df = pd.DataFrame(
{"a": [1, 2, 3, 4], "b": [10, np.nan, np.nan, 40], "c": ["a", "b", "c", "d"]}
)
df.case_when(
lambda x: x["b"].isna(), lambda x: x["a"], lambda x: x["b"], column_name="b"
)
Here, the case_when(...) can be integrated into a chain of manipulations and we still keep the whole dataframe in the chain.
I wonder how this could be accomplished without pyjanitor.
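One pandas-only possibility, assuming a single condition is enough, is Series.mask inside DataFrame.assign. mask replaces the values where the condition is True and keeps the rest, which roughly mirrors the condition / value / default structure of the case_when call. The rename step and the column name label below are hypothetical, added only to show that the chain can continue with the whole dataframe:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [10, np.nan, np.nan, 40], "c": ["a", "b", "c", "d"]}
)

result = (
    df
    # where b is NaN take the value from a, otherwise keep b
    .assign(b=lambda d: d["b"].mask(d["b"].isna(), d["a"]))
    # hypothetical follow-up step, only to illustrate further chaining
    .rename(columns={"c": "label"})
)
```

For several conditions, nested mask calls or numpy.select inside the same lambda might be worth a look, although that is not shown here.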