I have a data frame in which one of the columns (dtype = float64) has a set of values such as:
129.0
nan
100.0
87.0
40.0
344.992
130.0
101.0
227.0
147.0
190.0
83.0
-144.63542183979368
I wish to replace all the values that have a non-zero fractional part with nan. Such values, whether positive or negative, are actually junk. Only the positive values ending in .0 are genuine.
So, in the above case, -144.63542183979368 and 344.992 should be replaced with nan. The modified data frame column should look like this:
129.0
nan
100.0
87.0
40.0
nan
130.0
101.0
227.0
147.0
190.0
83.0
nan
How do I go about doing this?
At the end, after removing junk float values, I would like to change the dtype to integer (which can be done once the improper float values are removed).
CodePudding user response:
Try this:
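import pandas as pd
import numpy as np

arr = [129.0, np.nan, 100.0, 87.0, 40.0, 344.992, 130.0, 101.0, 227.0, 147.0, 190.0, 83.0, -144.63542183979368]
df = pd.DataFrame(arr, columns=['example'])
print(df)

def convert(row):
    # Note: a falsy value (0.0) goes to the outer else and becomes NaN.
    # NaN itself is truthy, so it enters this branch.
    if row:
        # Fractional part; Python's % gives a non-negative result here,
        # so negative junk values are caught as well.
        conv = row % 1
        if conv > 0:
            return np.nan  # non-zero fractional part: junk value
        else:
            return row  # whole number, or NaN (NaN > 0 is False)
    else:
        return np.nan

df['example'] = df['example'].apply(convert)
print(df)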
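A vectorized alternative might look like the sketch below. It is not part of the answer above, just one possible way to do the same thing without `apply`. The helper name `clean_whole_positive` is made up for illustration, and the `> 0` check assumes that zero and negative whole numbers should also count as junk, since the question says only positive values are genuine. Because NaN cannot live in a plain `int64` column, the sketch converts to pandas' nullable `Int64` dtype at the end.

```python
import numpy as np
import pandas as pd


def clean_whole_positive(series: pd.Series) -> pd.Series:
    """Keep positive whole-number values; replace everything else with NaN.

    Hypothetical helper, named here only for illustration.
    """
    is_whole = series % 1 == 0   # False for NaN and for fractional values
    is_positive = series > 0     # False for NaN, zero and negatives (assumption)
    # Series.where keeps values where the mask is True and puts NaN elsewhere.
    return series.where(is_whole & is_positive)


values = [129.0, np.nan, 100.0, 87.0, 40.0, 344.992, 130.0, 101.0,
          227.0, 147.0, 190.0, 83.0, -144.63542183979368]
df = pd.DataFrame({"example": values})

df["example"] = clean_whole_positive(df["example"])

# A plain int64 column cannot hold NaN, so use the nullable Int64 dtype;
# the missing entries are then shown as <NA>.
df["example"] = df["example"].astype("Int64")
print(df)
```

If the missing rows are not needed at all, calling `dropna()` on the cleaned column before `astype(int)` would give a regular `int64` column instead.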