My data has a column, Kilometer, whose values show up as "29261..". I need to remove the double dot (..) using a regular expression. I tried the code below, but it did not give the result I wanted:
import pandas as pd
df=[{
"UNIQUESERIALNO":"abcd123",
"Kilometer":"29261.."
}]
df=pd.DataFrame.from_dict(df)
df['Kilometer'].replace(regex=True, inplace=True, to_replace=r'[^0-9.\*.\*]', value=r'')
print(df)
CodePudding user response:
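You're almost there. The pattern [^0-9.\*.\*] is a negated character class: it matches any single character that is not a digit, a period, or an asterisk, so the periods are never removed. Match the two periods directly instead: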
import pandas as pd
df=[{
"UNIQUESERIALNO":"abcd123",
"Kilometer":"29261.."
}]
df=pd.DataFrame.from_dict(df)
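# Assign the result back; inplace=True on a selected column may not update df in newer pandas versions
df['Kilometer'] = df['Kilometer'].replace(to_replace=r'\.{2}', value='', regex=True)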
print(df)
Output:
  UNIQUESERIALNO Kilometer
0        abcd123     29261
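To see why the pattern from the question leaves the value unchanged while \.{2} works, here is a minimal sketch using Python's re module on the sample string from the question. It only illustrates the two patterns and is not part of the pandas solution itself:

```python
import re

value = "29261.."

# Pattern from the question: a negated character class. It matches any
# character that is NOT a digit, a period, or an asterisk, so nothing in
# "29261.." matches and the string comes back unchanged.
original_attempt = re.sub(r'[^0-9.\*.\*]', '', value)

# Pattern from this answer: exactly two consecutive literal periods.
fixed = re.sub(r'\.{2}', '', value)

print(original_attempt)  # 29261..
print(fixed)             # 29261
```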
CodePudding user response:
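Use str.replace and assign the result back to the specific column. Note that this removes every period in the value, so it is only safe if the column never contains decimals: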
import pandas as pd
df=[{
"UNIQUESERIALNO":"abcd123",
"Kilometer":"29261.."
}]
df=pd.DataFrame.from_dict(df)
# regex=False treats '.' as a literal period rather than "any character"
df['Kilometer'] = df['Kilometer'].str.replace('.', '', regex=False)
print(df)
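As a rough comparison, the sketch below contrasts a literal replacement of every period with a regex that only strips trailing periods. The sample values "123.5.." and "400" are made up for illustration and are not from the question:

```python
import pandas as pd

# Hypothetical sample values; only "29261.." comes from the question.
s = pd.Series(["29261..", "123.5..", "400"])

# Literal replacement: removes every period, including a real decimal point.
all_removed = s.str.replace(".", "", regex=False)

# Anchored regex: removes only a run of two or more periods at the end.
trailing_only = s.str.replace(r"\.{2,}$", "", regex=True)

print(all_removed.tolist())    # ['29261', '1235', '400']
print(trailing_only.tolist())  # ['29261', '123.5', '400']
```

If the column can hold decimal values, the anchored form is probably the safer choice.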
CodePudding user response:
If you want to remove those two periods only in that specific context (at the end of the value, immediately after a decimal digit), you can use a look-behind in the regex. Use a raw string for the pattern and assign the result back to the column:
df['Kilometer'] = df['Kilometer'].replace(r'(?<=\d)\.\.$',  # (?<=\d) means "after a digit"
                                          '', regex=True)
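A small sketch of how this look-behind pattern behaves on a few inputs. Apart from "29261..", the sample values are invented here purely to show which cases are left untouched:

```python
import pandas as pd

# Hypothetical samples: only the first value comes from the question.
samples = pd.Series(["29261..", "..", "12..34", "29261."])

# Remove two periods only when they end the value and follow a digit.
cleaned = samples.replace(r"(?<=\d)\.\.$", "", regex=True)

print(cleaned.tolist())  # ['29261', '..', '12..34', '29261.']
```

Only the first value changes: ".." has no preceding digit, "12..34" does not end with the periods, and "29261." has a single period.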