I have a pandas dataframe:
import pandas as pd
df = pd.DataFrame({'start': [50, 100, 50000, 50030, 100000],
                   'end': [51, 101, 50001, 50031, 100001],
                   'value': [1.00, 2.1234567, 3.01, 4.12345, 5.456789]})
I would like to filter on the 'value' column and keep only the rows whose value has more than two decimal places:
start end value
100 101 2.1234567
50030 50031 4.12345
100000 100001 5.456789
How can I filter the column by the number of decimal places?
CodePudding user response:
Use Series.astype with Series.str.split, Series.map and Series.gt:
Cast the value column to str.
Split each string on '.' and pick the 2nd part.
Then get the length of the decimal part.
Pick the rows with length > 2.
In [639]: df[df['value'].astype(str).str.split('.').str[1].map(len).gt(2)]
Out[639]:
start end value
1 100 101 2.123457
3 50030 50031 4.123450
4 100000 100001 5.456789
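To make the chain easier to follow, here is a sketch that splits the one-liner above into named intermediate steps. The variable names (as_text, decimals, n_decimals, filtered) are only illustrative and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'start': [50, 100, 50000, 50030, 100000],
                   'end': [51, 101, 50001, 50031, 100001],
                   'value': [1.00, 2.1234567, 3.01, 4.12345, 5.456789]})

# 1. Cast the float column to strings, e.g. 3.01 -> '3.01'
as_text = df['value'].astype(str)

# 2. Split on the decimal point and keep the part after it
decimals = as_text.str.split('.').str[1]

# 3. Count the digits in the decimal part
n_decimals = decimals.str.len()

# 4. Keep only the rows with more than two decimal digits
filtered = df[n_decimals.gt(2)]
print(filtered)
```

The printed frame uses pandas' default display precision of six decimals, so 2.1234567 is shown as 2.123457 even though the stored value is unchanged.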
CodePudding user response:
It is possible by converting to strings, but on real data this solution can fail because of floating-point accuracy:
df = df[df['value'].astype(str).str.extract(r'\.(\d+)$', expand=False).str.len().gt(2)]
print(df)
start end value
1 100 101 2.123457
3 50030 50031 4.123450
4 100000 100001 5.456789
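The float-accuracy caveat shows up with a computed value such as 0.1 + 0.2, whose string form has far more than two decimals. One possible alternative, which is my own assumption and not something either answer proposes, is to compare each value with itself rounded to two decimals, within a small tolerance. The tolerance of 1e-9 is an arbitrary choice and should be adapted to the scale of the data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'start': [50, 100, 50000, 50030, 100000],
                   'end': [51, 101, 50001, 50031, 100001],
                   'value': [1.00, 2.1234567, 3.01, 4.12345, 5.456789]})

# A computed float can carry representation noise in its string form:
# 0.1 + 0.2 is "meant" to be 0.3 but is stored as a slightly larger number.
noisy = 0.1 + 0.2
print(str(noisy))

# Assumed alternative: a value has more than two decimals if it differs
# from its own 2-decimal rounding by more than a small absolute tolerance.
mask = ~np.isclose(df['value'], df['value'].round(2), rtol=0, atol=1e-9)
result = df[mask]
print(result)
```

With this check the noisy 0.1 + 0.2 would count as having at most two decimals, whereas the string-based approach would count every digit of its representation.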