Given this DataFrame, I want to replace every non-numeric string value in a column with 0. I tried the code below, but the DataFrame remains unchanged. I am not sure how to change the parameters of this line:
df['new'].replace(to_replace=r'^$', value=0, regex=True)
Here is the whole code:
import numpy as np
import pandas as pd

l2 = ['a. 12', 'b. 75', '23', 'sc/a 34', '85', 'a 32', 'b 345']
d = {'col1': []}
df = pd.DataFrame(data=d)
df['col1'] = l2
# keep purely numeric strings as they are; otherwise keep only the leading letters/slashes
df['new'] = np.where(df["col1"].str.isnumeric(), df["col1"].str[:], df["col1"].str.extract("^([a-z/]*)", expand=False))
print(df['new'])
df['new'].replace(to_replace=r'^$', value=0, regex=True)
print(df['new'])
So the DataFrame column should end up with the following values:
0 0 23 0 85 0 0
CodePudding user response:
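Your regex matches only an empty string:
there is nothing between the start (^) and the end ($) of the string.
You should use a pattern that actually matches the values you want to replace, such as .* (note that .* matches every string, including '23' and '85').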
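To make the difference between these patterns concrete, here is a minimal sketch using only the standard re module. The list mirrors what df['new'] holds after the question's code runs; the helper name matching and the pattern ^\D+$ are illustrative assumptions, not something from the question.

```python
import re

# Contents of df['new'] after the question's np.where step (all strings).
values = ['a', 'b', '23', 'sc/a', '85', 'a', 'b']

def matching(pattern):
    """Return the values in which the pattern is found (hypothetical helper)."""
    return [v for v in values if re.search(pattern, v)]

empty_only = matching(r'^$')    # matches only empty strings, so nothing here
anything = matching(r'.*')      # matches every value, the numeric ones too
no_digits = matching(r'^\D+$')  # assumed alternative: values made of non-digits only

print(empty_only)
print(anything)
print(no_digits)
```

If the numeric strings have to survive, a pattern along the lines of ^\D+$ is probably closer to the intent than .*.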
CodePudding user response:
Try to_replace='^', value=0, regex=True, inplace=True.
That is: if the regex matches, the value must be a string, so replace it with the number 0.
Reference: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.replace.html
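As a rough sketch of why the original call appeared to do nothing: replace returns a new Series unless inplace=True is passed, so the result has to be assigned back. The pattern ^\D+$ below is an assumption chosen because every value in this example is a string, so '^' alone would match all of them; the object dtype is also set explicitly as an assumption to keep the behaviour the same across pandas versions.

```python
import pandas as pd

# df['new'] as produced by the question's code; object dtype, all values are strings.
df = pd.DataFrame(
    {'new': pd.Series(['a', 'b', '23', 'sc/a', '85', 'a', 'b'], dtype=object)}
)

# Without inplace=True or an assignment, the returned Series is discarded.
df['new'].replace(to_replace=r'^\D+$', value=0, regex=True)
untouched = df['new'].tolist()

# Assigning the result back updates the column.
df['new'] = df['new'].replace(to_replace=r'^\D+$', value=0, regex=True)
print(df['new'].tolist())
```

Note that '23' and '85' are still strings after this step; converting them to integers would be a separate step.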
PS If you want to strip non-numeric characters and all numbers are integers, then this might be what you want:
df['new'].replace(to_replace=r'\D+', value='', regex=True, inplace=True)
That is: in every value (all your values in the example are strings) replace any sequence of non-digits with the empty string.
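A small sketch of that idea, applied to the original col1 values rather than df['new'] (an assumption made here so that every value still contains digits after stripping). The variable names col1, digits and numbers are illustrative only.

```python
import pandas as pd

# Original values from the question; object dtype, all strings.
col1 = pd.Series(
    ['a. 12', 'b. 75', '23', 'sc/a 34', '85', 'a 32', 'b 345'], dtype=object
)

# Remove every run of non-digit characters, leaving only the digits.
digits = col1.replace(to_replace=r'\D+', value='', regex=True)

# Assumption: each value still has at least one digit, so the conversion cannot fail.
numbers = digits.astype(int)
print(numbers.tolist())
```

If some values contain no digits at all, they become empty strings and astype(int) would raise, so those would need to be handled first.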