I am working with a dataset; below you can see a small example.
import pandas as pd
import numpy as np
data = {'id': ['9.', '09', 9]}
df = pd.DataFrame(data, columns=['id'])
df['id'] = df['id'].replace(".","")
df
In this dataset, the same number is written in three different ways: '9.', '09', and 9.
I want all of these values written the same way, as 9, without the trailing . or the leading 0.
I tried the following line of code, but it does not work:
df['id'] = df['id'].replace(".","")
Can anybody help me solve this problem?
CodePudding user response:
You could convert them all to doubles and then to integers:
df['id'].astype('double').astype(int)
But this can be a problem for large numbers that exceed the 53-bit significand of a double, because they can no longer be represented exactly. If the errant period is always at the end of the string, you could instead do:
df['id'].astype(str).str.strip('.').astype(int)
Alternatively, you could use a regex that strips the dot plus anything else to its right.
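A minimal sketch of that regex idea, assuming everything from the first dot onward is noise rather than meaningful digits. The pattern and the names `as_text`, `cleaned`, and `big` are illustrative, and the large value is a made-up id used only to show the precision concern with the double route:

```python
import pandas as pd

df = pd.DataFrame({'id': ['9.', '09', 9]})

# Cast to str first so the .str accessor also works on the integer 9.
as_text = df['id'].astype(str)

# Remove the first dot and everything to its right.
cleaned = as_text.str.replace(r'\..*$', '', regex=True)

# Integer parsing ignores the leading zero, so '09' becomes 9.
df['id'] = cleaned.astype('int64')

# Hypothetical large id (2**53 + 1) to compare both routes.
big = pd.Series(['9007199254740993.'])
via_float = big.astype('double').astype('int64')
exact = big.str.replace(r'\..*$', '', regex=True).astype('int64')

print(df['id'].tolist())
print(via_float.iloc[0], exact.iloc[0])
```

The string route never passes through a float, so large ids keep every digit, while the double route rounds the hypothetical value to the nearest representable double.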
CodePudding user response:
Try this:
pd.to_numeric(df['id'])
0    9.0
1    9.0
2    9.0
Name: id, dtype: float64
To convert to int:
pd.to_numeric(df['id']).astype(int)
0    9
1    9
2    9
Name: id, dtype: int64
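Note that the calls above return a new Series and leave df unchanged. A short sketch of writing the result back to the column, assuming every entry parses as a number (`errors='raise'` is already the default and is spelled out here only for clarity):

```python
import pandas as pd

df = pd.DataFrame({'id': ['9.', '09', 9]})

# With errors='raise', an entry that is not a number raises ValueError
# instead of silently becoming NaN, which would break the integer cast.
df['id'] = pd.to_numeric(df['id'], errors='raise').astype('int64')

print(df['id'].tolist())
```

As with the double approach, this goes through float64, so it is best suited to ids small enough to be represented exactly.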