Pandas: filter values by number of decimal places

Time:05-20

I have a pandas dataframe:

df = pd.DataFrame({'start': [50, 100, 50000, 50030, 100000],
                   'end': [51, 101, 50001, 50031, 100001],
                   'value': [1.00, 2.1234567, 3.01, 4.12345, 5.456789]})

I would like to filter the 'value' column and keep only the rows whose value has more than two decimal places:

start   end        value
100     101        2.1234567
50030   50031      4.12345
100000  100001     5.456789

How can I filter the column by the number of decimal places?

CodePudding user response:

Use Series.astype with Series.str.split, Series.map and Series.gt:

Cast the 'value' column to str, split each string on '.' and take the second part (the digits after the decimal point). Then get the length of that part and keep the rows where the length is greater than 2.

In [639]: df[df['value'].astype(str).str.split('.').str[1].map(len).gt(2)]
Out[639]: 
    start     end     value
1     100     101  2.123457
3   50030   50031  4.123450
4  100000  100001  5.456789
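
For reference, here is a self-contained sketch of the same steps with each intermediate result named. The names as_text, n_decimals and filtered are illustrative and not part of the answer above. It assumes every value prints in plain decimal notation (no exponent such as 1e-07).

```python
import pandas as pd

df = pd.DataFrame({'start': [50, 100, 50000, 50030, 100000],
                   'end': [51, 101, 50001, 50031, 100001],
                   'value': [1.00, 2.1234567, 3.01, 4.12345, 5.456789]})

# Text form of each float, e.g. 1.0 -> '1.0', 3.01 -> '3.01'
as_text = df['value'].astype(str)

# Digits after the decimal point, then how many there are
n_decimals = as_text.str.split('.').str[1].str.len()

# Keep only rows with more than two decimal digits
filtered = df[n_decimals.gt(2)]
print(filtered)
```

Note that the printed frame rounds floats to six decimals for display, so 2.1234567 appears as 2.123457 even though the stored value is unchanged.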

CodePudding user response:

This is possible by converting to strings, but on real data this solution may fail because of floating-point inaccuracy:

df = df[df['value'].astype(str).str.extract(r'\.(\d+)$', expand=False).str.len().gt(2)]
print (df)

    start     end     value
1     100     101  2.123457
3   50030   50031  4.123450
4  100000  100001  5.456789
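
To illustrate the float-accuracy caveat, here is a minimal sketch. The sample values and the tolerance of 1e-9 are assumptions chosen for this example, and the numeric comparison against the rounded value is only one possible alternative, not the method from either answer.

```python
import pandas as pd

# 0.1 + 0.2 is stored as 0.30000000000000004, not 0.3
df = pd.DataFrame({'value': [0.3, 0.1 + 0.2, 3.01, 2.1234567]})

# String-based count of decimal digits: the computed 0.3 looks like 17 digits
text_len = (df['value'].astype(str)
            .str.extract(r'\.(\d+)$', expand=False)
            .str.len())
print(text_len.tolist())   # [1, 17, 2, 7]

# Numeric alternative (assumed tolerance): a value has more than two
# decimals if it differs from itself rounded to two places
has_more = (df['value'] - df['value'].round(2)).abs().gt(1e-9)
print(df[has_more])        # only the 2.1234567 row remains
```

With the string approach the second row would be kept even though it is 0.3 for practical purposes, while the tolerance-based check drops it.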