My pandas dataframe looks as follows:
col1 | col2 |
---|---|
1 | ABC8392akl |
2 | 001523 |
3 | 000ABC58 |
Now I want to remove the leading zeros, but only if the string is purely numeric. Any suggestions? The outcome should be:
col1 | col2 |
---|---|
1 | ABC8392akl |
2 | 1523 |
3 | 000ABC58 |
CodePudding user response:
You can use a regex with str.replace for this:
df['col2'] = df['col2'].str.replace(r'^0+(?!.*\D)', '', regex=True)
output:
col1 col2
0 1 ABC8392akl
1 2 1523
2 3 000ABC58
regex:
^0+       # match one or more leading zeros
(?!.*\D)  # only if not followed at some point by a non-digit character
variant suggested by @timgeb:
df['col2'] = df['col2'].str.replace(r'^0+(\d*)$', r'\1', regex=True)
regex:
^0+    # match one or more leading zeros
(\d*)  # capture other digits (if any)
$      # match end of string
replace with captured digits (\1)
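To check how the two patterns behave on individual strings, here is a minimal sketch using Python's standard re module instead of pandas (it should behave the same way for ordinary object-dtype string columns). The extra sample "000" and the third pattern, keep_one, are assumptions added for illustration and are not part of the answers above: both original patterns turn an all-zero string into an empty string, and keep_one is one possible way to leave a single zero behind.

```python
import re

# Sample values from the question, plus an all-zero string (added for illustration).
samples = ["ABC8392akl", "001523", "000ABC58", "000"]

# Pattern from the first answer: leading zeros not followed anywhere by a non-digit.
lookahead = re.compile(r"^0+(?!.*\D)")
# Variant: leading zeros followed only by digits; the digits are captured and kept.
capture = re.compile(r"^0+(\d*)$")
# Hypothetical tweak: require at least one digit to remain after the stripped zeros.
keep_one = re.compile(r"^0+(?=\d+$)")

stripped_lookahead = [lookahead.sub("", s) for s in samples]
stripped_capture = [capture.sub(r"\1", s) for s in samples]
stripped_keep_one = [keep_one.sub("", s) for s in samples]

print(stripped_lookahead)
print(stripped_capture)
print(stripped_keep_one)
```

If an all-zero value such as "000" should become "0" rather than an empty string, the keep_one pattern can be passed to str.replace in the same way as the other two.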
CodePudding user response:
Use str.isdigit to select the purely numeric rows, then str.lstrip to strip their leading zeros:
where = (df['col2'].str.isdigit(), 'col2')
df.loc[where] = df.loc[where].str.lstrip('0')
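Element by element, the mask-and-strip logic above amounts to the following plain-Python sketch. The helper name strip_zeros_if_numeric is hypothetical and only used here for illustration.

```python
def strip_zeros_if_numeric(value: str) -> str:
    """Strip leading zeros only when the whole string consists of digits."""
    # str.isdigit() is False for empty strings and for strings containing letters.
    return value.lstrip("0") if value.isdigit() else value


values = ["ABC8392akl", "001523", "000ABC58"]
result = [strip_zeros_if_numeric(v) for v in values]
print(result)
```

A function like this could also be applied to the column with df['col2'].map(strip_zeros_if_numeric), assuming every value in the column is a string. As with the regex answers, an all-zero string such as "000" ends up empty.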
CodePudding user response:
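Is there a problem with this, or do you want to keep the numbers as strings?
df['col2'] = pd.to_numeric(df['col2'], errors='ignore')
Note that with errors='ignore', pd.to_numeric returns the whole input unchanged if any value cannot be parsed, so a column that mixes numeric and non-numeric strings is not converted at all.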