How can I remove '$'
in a column values?
Example: I have a column with values like $40
, $23
, $35
,
I want to see those column values like 40
, 23
, 35
.
I have already tried:
DF_['H1 % Change'] = DF_['H1 % Change'].replace(r'$', '')
This didn't work!
CodePudding user response:
Series.replace
replaces all values in the column that exactly match the first argument. You want Series.str.replace
to replace a substring:
DF_['H1 % Change'] = DF_['H1 % Change'].str.replace(r'$', '')
CodePudding user response:
You can iterate over all elements in you vector:
DF_['H1 % Change'] = [x.replace(r'$', '') for x in DF_['H1 % Change'] ]
This will replace elements one-by-one
CodePudding user response:
You can still do it with .replace()
with supplying the parameter regex=
, as follows:
DF_['H1 % Change'] = DF_['H1 % Change'].replace(r'\$', '', regex=True)
Demo
Data Input
DF_ = pd.DataFrame({'H1 % Change': ['$40', '$23', '$35']})
print(DF_)
H1 % Change
0 $40
1 $23
2 $35
Output
DF_['H1 % Change'] = DF_['H1 % Change'].replace(r'\$', '', regex=True)
print(DF_)
H1 % Change
0 40
1 23
2 35
Here, with specifying regex=True
, the first parameter is regarded as a regex. Hence, we need to quote the regex as \$
(with a backslash) in order to "escape" the $
symbol which, by coincident, is also a regex metacharacter (for end-of string anchor).
Detailed Explanation
Some more background information below for your reference:
Your code with .replace()
doesn't work because by default, regex=False
for .replace()
and with this setting, you can see from the official document (extracted below):
Parameters to_replace str, regex, list, dict, Series, int, float, or None
str: string exactly matching to_replace will be replaced with value
regex: regexs matching to_replace will be replaced with value
that when we supply the first parameter to_replace
with a string (with default regex=False
), it requires exact matching (full-match) string in order to replace. That is, if you specify $
, then no other characters are allowed in the testing string, in order to get a match.
When it is a regex (with regex=True
), testing string matching the regex (can be partial match) will be regarded as a match.
Alternate Solution
Another way to remove the dollar sign $
is by using .str.replace()
, which has by default regex=True
. Hence, we can do it simply by:
DF_['H1 % Change'] = DF_['H1 % Change'].str.replace(r'\$', '')
or simply:
DF_['H1 % Change'] = DF_['H1 % Change'].str.replace(r'$', '')
Here, it turns out that we can use both \$
(with backslash) and $
(without backslash) to achieve the same result.
For people using newer version of Pandas, you may get a warning message if you don't specify the parameter regex=
:
FutureWarning: The default value of regex will change from True to False in a future version.
Hence, it is recommended to quote also the parameter regex=
as well, like below:
DF_['H1 % Change'] = DF_['H1 % Change'].str.replace(r'\$', '', regex=True)