Conditional binary replacement in Pandas column with NaNs

I have the following Pandas data frame in Python:

col1
----
A
B
NaN
A
A
NaN
NaN
B
C

I would like to replace the values so that every A stays A, every other non-missing value (B and C in this example) becomes D, and NaN remains unchanged. What is the appropriate way to do this? The required output is:

col1
----
A
D
NaN
A
A
NaN
NaN
D
D

I have tried these so far:

df["col1"] = np.where(df["col1"] == "A", "A", "D"), but this changed NaNs to D as well.

df["col1"].replace(["A", "B", "C"], ["A", "D", "D"]) seems better, but in my real scenario there are far more non-A values that I want to change to D, so exhaustive enumeration is problematic.

CodePudding user response:

Use boolean indexing to update only the values that are neither A nor NaN:

df.loc[df['col1'].ne('A')&df['col1'].notna(), 'col1'] = 'D'

Output:

  col1
0    A
1    D
2  NaN
3    A
4    A
5  NaN
6  NaN
7    D
8    D
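
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the column holds plain strings and np.nan as in the question; the name to_replace is only illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data from the question (assumed: strings plus np.nan for missing).
df = pd.DataFrame({"col1": ["A", "B", np.nan, "A", "A", np.nan, np.nan, "B", "C"]})

# ne("A") is True for NaN as well, so notna() is needed to exclude
# the missing rows from the replacement.
to_replace = df["col1"].ne("A") & df["col1"].notna()

# Overwrite only the selected rows; A and NaN are left untouched.
df.loc[to_replace, "col1"] = "D"
print(df)
```

Building the mask as a separate variable makes it easy to inspect which rows will change before assigning.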

CodePudding user response:

Use Series.mask:

df["col1"] = df["col1"].mask(df["col1"].ne("A") & df["col1"].notna(), "D")
print(df)
  col1
0    A
1    D
2  NaN
3    A
4    A
5  NaN
6  NaN
7    D
8    D
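
Series.where is the mirror image of mask: it keeps values where the condition is True and replaces the rest. A minimal sketch of that variant, assuming the same sample data as in the question (the name keep is illustrative only):

```python
import numpy as np
import pandas as pd

# Sample data from the question (assumed: strings plus np.nan for missing).
df = pd.DataFrame({"col1": ["A", "B", np.nan, "A", "A", np.nan, np.nan, "B", "C"]})

# Rows to leave alone: the value is "A" or it is missing.
keep = df["col1"].eq("A") | df["col1"].isna()

# where() keeps values where `keep` is True and writes "D" everywhere else.
df["col1"] = df["col1"].where(keep, "D")
print(df)
```

Which of the two to use is mostly a matter of whether the "keep" or the "replace" condition is easier to state.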

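Your np.where solution can be adapted by handling NaN first. Take the missing branch from the original column rather than from np.nan; otherwise NumPy may coerce np.nan to the string 'nan' when it combines it with the string results:

import numpy as np

df["col1"] = np.where(df["col1"].isna(), df["col1"],
             np.where(df["col1"].eq("A"), "A", "D"))
print(df)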
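
The two nested np.where calls can also be collapsed into one by treating "is A or is missing" as the keep condition. This is a minimal sketch under the assumption that the column looks like the sample in the question. Taking the kept values from the column itself avoids mixing the float np.nan with strings in a single NumPy array, which can turn missing values into the text 'nan'.

```python
import numpy as np
import pandas as pd

# Sample data from the question (assumed: strings plus np.nan for missing).
df = pd.DataFrame({"col1": ["A", "B", np.nan, "A", "A", np.nan, np.nan, "B", "C"]})

s = df["col1"]

# Keep the original value when it is "A" or missing, otherwise use "D".
keep = s.eq("A") | s.isna()
df["col1"] = np.where(keep, s, "D")

# The missing entries are still real NaN values, not the string "nan".
print(df["col1"].isna().sum())
print(df)
```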