I have the following Pandas data frame in Python:
col1
----
A
B
NaN
A
A
NaN
NaN
B
C
I would like to replace the values so that all A remain A, all other values (B and C in this example) are replaced with D, and NaN remains unchanged. What is the appropriate way to do this? The required output is:
col1
----
A
D
NaN
A
A
NaN
NaN
D
D
I have tried these so far:
df["col1"] = np.where(df["col1"] == "A", "A", "D"), but this changed the NaNs to D as well.
df["col1"].replace(["A", "B", "C"], ["A", "D", "D"]) seems better, but in my real scenario there are far more non-A values that I want to change to D, so exhaustive enumeration is problematic.
CodePudding user response:
Use boolean indexing to update the values that are neither A nor NaN:
df.loc[df['col1'].ne('A')&df['col1'].notna(), 'col1'] = 'D'
Output:
col1
0 A
1 D
2 NaN
3 A
4 A
5 NaN
6 NaN
7 D
8 D
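For anyone who wants to run this end to end, here is a minimal sketch of the same boolean-indexing approach. The sample frame is rebuilt from the question, and the name to_replace is only an illustrative variable, not something from the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild the sample column from the question.
df = pd.DataFrame({"col1": ["A", "B", np.nan, "A", "A", np.nan, np.nan, "B", "C"]})

# True only where a value is present and is not "A".
to_replace = df["col1"].ne("A") & df["col1"].notna()

# Assign in place; rows where the mask is False (A and NaN) are left untouched.
df.loc[to_replace, "col1"] = "D"
print(df)
```

The notna() part of the mask is what protects the missing values, since a NaN also compares as "not equal to A".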
CodePudding user response:
Use Series.mask:
df["col1"] = df["col1"].mask(df["col1"].ne("A") & df['col1'].notna(), "D")
print (df)
col1
0 A
1 D
2 NaN
3 A
4 A
5 NaN
6 NaN
7 D
8 D
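As a runnable sketch, assuming the sample data from the question: Series.mask replaces values where the condition is True, while Series.where keeps values where the condition is True, so the same result can be written either way. The variable names col, masked and kept are illustrative only.

```python
import numpy as np
import pandas as pd

# Rebuild the sample column from the question.
df = pd.DataFrame({"col1": ["A", "B", np.nan, "A", "A", np.nan, np.nan, "B", "C"]})
col = df["col1"]

# mask: replace with "D" wherever the value is present and not "A".
masked = col.mask(col.ne("A") & col.notna(), "D")

# where: keep the value wherever it is "A" or missing, otherwise use "D".
kept = col.where(col.eq("A") | col.isna(), "D")

# Series.equals treats NaNs in the same positions as equal.
print(masked.equals(kept))
```

Both calls return a new Series and leave df unchanged until the result is assigned back to df["col1"].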
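Your np.where solution can also be made to work by handling NaN first. Take the NaN rows from the original column, because a bare np.nan combined with string values in np.where is cast to the string 'nan':
df["col1"] = np.where(df["col1"].isna(), df["col1"],
                      np.where(df["col1"].eq('A'), 'A', 'D'))
print (df)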
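A self-contained sketch of the np.where route, again assuming the sample data from the question. This variant uses a single np.where call and passes the original column through for the rows to keep, so the missing entries stay real NaN values; the variable names col and keep are illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the sample column from the question.
df = pd.DataFrame({"col1": ["A", "B", np.nan, "A", "A", np.nan, np.nan, "B", "C"]})
col = df["col1"]

# Rows to leave alone: already "A", or missing.
keep = col.eq("A") | col.isna()

# Keep the original value on those rows, use "D" everywhere else.
df["col1"] = np.where(keep, col, "D")
print(df)

# The missing rows are still detected as missing, not as text.
print(df["col1"].isna().sum())
```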