Replace every single string value with 0

Given this dataframe, I want to replace every string value in a column with the value 0. I tried the code below, but the dataframe remains unaffected. I am not sure how to change the parameters of this line:

df['new'].replace(to_replace=r'^$', value=0, regex=True)

Here is the whole code:

import numpy as np
import pandas as pd

l2 = ['a. 12', 'b. 75', '23', 'sc/a 34', '85', 'a 32', 'b 345']
d = {'col1': []}
df = pd.DataFrame(data=d)
df['col1'] = l2

df['new'] = np.where(df["col1"].str.isnumeric(),df["col1"].str[:], (df["col1"].str.extract("^([a-z/]*)", expand=False)) )
print(df['new'])
df['new'].replace(to_replace=r'^$', value=0, regex=True)
print(df['new'])

So the dataframe column should have the following values:

0 0 23 0 85 0 0

CodePudding user response:

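Your regex represents an empty string:

there is nothing between the start (^) and the end ($) of the string, so it never matches a non-empty value.

You should use: .* (note that this matches every string, including the digit-only ones such as '23' and '85')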
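To see what each pattern matches, here is a minimal sketch using Python's re module. The list holds the values the new column contains after the np.where step in the question. They are copied by hand here, which is an assumption about the intermediate result.

```python
import re

# Values of df['new'] after the np.where/extract step (written out by hand).
values = ['a', 'b', '23', 'sc/a', '85', 'a', 'b']

# '^$' only matches the empty string, so none of these values match.
empty_hits = [v for v in values if re.search(r'^$', v)]

# '.*' matches any string at all, including the digit-only ones.
any_hits = [v for v in values if re.search(r'.*', v)]

print(empty_hits)
print(any_hits)
```

Neither pattern singles out the letter-only values, so a pattern or condition that distinguishes digits from non-digits is still needed to get the output in the question.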

CodePudding user response:

Try to_replace='^', value=0, regex=True, inplace=True.

That is: if the regex matches, the value must be a string, so it is replaced with the number 0. Since every value in the example is a string, this also replaces '23' and '85'.

Reference: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.replace.html
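The original line also had no visible effect because replace returns a new Series unless the result is assigned back (or inplace=True is passed). A minimal sketch of that point follows. The pattern r'\D' (any value containing a non-digit) is my own assumption for getting the output the question asks for, and the Series is typed by hand to mirror the new column.

```python
import pandas as pd

# The 'new' column as the question's np.where/extract step produces it
# (written out by hand; every value is a string).
new = pd.Series(['a', 'b', '23', 'sc/a', '85', 'a', 'b'], dtype=object)

# replace() returns a new Series; without assignment nothing changes.
new.replace(to_replace=r'\D', value=0, regex=True)
unchanged = new.tolist()

# Assign the result back. r'\D' matches any value that contains a
# non-digit, so digit-only strings such as '23' are left alone.
new = new.replace(to_replace=r'\D', value=0, regex=True)
print(new.tolist())
```

The kept values are still the strings '23' and '85'. If integers are wanted instead, pd.to_numeric(..., errors='coerce').fillna(0) is another option.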

PS If you want to strip non-numeric characters and all numbers are integers, then this might be what you want:

df['new'].replace(to_replace=r'\D+', value='', regex=True, inplace=True)

That is: in every value (all your values in the example are strings) replace any sequence of non-digits with the empty string.
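As a rough illustration of the stripping idea, the sketch below applies it to the original col1 values rather than to new. That is an assumption about the intent, since on new the letter-only values would become empty strings. It uses Series.str.replace and then converts the result to integers.

```python
import pandas as pd

col1 = pd.Series(['a. 12', 'b. 75', '23', 'sc/a 34', '85', 'a 32', 'b 345'])

# Remove every run of non-digit characters, then convert to integers.
digits = pd.to_numeric(col1.str.replace(r'\D+', '', regex=True))
print(digits.tolist())  # [12, 75, 23, 34, 85, 32, 345]
```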