I have the following setup:
data = pd.DataFrame({
"a": ["Amount 3,7$", "Amount 6,7$", "Amount 1,3$"],
"b": ["2,3", " Amount 3,5", "4,7"]
})
I want to find any cells with the substring Amount
and replace the character ,
with .
I expect:
a b
0 Amount 3.7$ 2,3
1 Amount 6.7$ Amount 3.5
2 Amount 1.3$ 4,7
I have tried:
for column in data.columns:
data[column] = data[column][data[column].str.contains("Amount", na=False)].replace(",", ".", regex=True)
What I get is:
a b
0 Amount 3.7$ NaN
1 Amount 6.7$ NaN
2 Amount 1.3$ NaN
Somehow the mask is not working as expected
CodePudding user response:
One approach:
data["b"] = np.where(data["b"].str.contains("Amount"), data["b"].str.replace(",", "."), data["b"])
print(data)
Output
a b
0 Amount 3,7$ 2,3
1 Amount 6,7$ Amount 3.5
2 Amount 1,3$ 4,7
An alternative is to use the following list comprehension:
data["b"] = [x.replace(",", ".") if "Amount" in x else x for x in data["b"]]
If "Amount"
is always at the beginning, you could use a more elaborated regex, as below:
data["b"] = data["b"].str.replace("(.*Amount.*)(,)", r"\1.", regex=True)
CodePudding user response:
applymap
data.applymap(lambda s: s.replace(',', '.') if 'Amount' in s else s)
a b
0 Amount 3.7$ 2,3
1 Amount 6.7$ Amount 3.5
2 Amount 1.3$ 4,7