Extracting numbers (with decimal values) from a Column

Time:02-17

I have a column with the values below, and I would like to extract the numbers (including decimals) from it.

13
12
13.5 
420
0
9.75
007
7
11.27
.10
1776
...10
..11

When I use df["x"] = df["x"].str.extract(r'([0-9]+.*)'), the values that start with "." are dropped from the result. Note that .10 and ...10 should return 10, and ..11 should return 11.

CodePudding user response:

It seems that you only need to strip the dots and convert to float:

df['x'].str.strip('.').astype(float)

Output:

0       13.00
1       12.00
2       13.50
3      420.00
4        0.00
5        9.75
6        7.00
7        7.00
8       11.27
9       10.00
10    1776.00
11      10.00
12      11.00
Name: x, dtype: float64
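A variant of the same idea, as a minimal sketch rather than part of the original answer: str.lstrip('.') removes only the leading dots, and pd.to_numeric with errors='coerce' turns anything that still is not a number into NaN instead of raising. The variable names s, cleaned and numbers are chosen here for illustration, and the sample values are the ones from the question.

```python
import pandas as pd

# Sample values copied from the question (note the trailing space in '13.5 ').
s = pd.Series(['13', '12', '13.5 ', '420', '0', '9.75', '007', '7',
               '11.27', '.10', '1776', '...10', '..11'])

# Drop surrounding whitespace first, then only the leading dots.
cleaned = s.str.strip().str.lstrip('.')

# Convert to numbers; unparseable entries would become NaN instead of raising.
numbers = pd.to_numeric(cleaned, errors='coerce')

print(numbers)
```

Whether to use strip('.') or lstrip('.') depends on the data: strip('.') also removes trailing dots, which this sample does not contain.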

CodePudding user response:

To keep your approach, I added str.strip('.') to remove the leading dots and changed the regex to ([1-9]+.*|0\Z), so that it returns the values you expect without leading zeros:

import pandas as pd

df = pd.DataFrame({
    'x': ['13', '12', '13.5', '420', '0', '9.75', '007', '7', '11.27', '.10', '1776', '...10', '..11']
})

# Strip the dots at both ends, then capture from the first non-zero digit onward (or a lone "0")
df['x'] = df['x'].str.strip('.').str.extract(r'([1-9]+.*|0\Z)')

print(df['x'])

Result:

0        13
1        12
2      13.5
3       420
4         0
5      9.75
6         7
7         7
8     11.27
9        10
10     1776
11       10
12       11
Name: x, dtype: object
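For comparison, the stripping step can also be folded into the pattern itself. The following is a sketch under assumptions, not the answerer's method: the pattern (\d+(?:\.\d+)?) and the column name x_num are introduced here for illustration. Because str.extract returns the first match found anywhere in the string, the leading dots are skipped without a separate strip. Leading zeros stay in the extracted text ('007') and only disappear after the conversion to float.

```python
import pandas as pd

df = pd.DataFrame({
    'x': ['13', '12', '13.5', '420', '0', '9.75', '007', '7',
          '11.27', '.10', '1776', '...10', '..11']
})

# One or more digits, optionally followed by a dot and more digits.
# The first match in each string is returned, so leading dots are ignored.
extracted = df['x'].str.extract(r'(\d+(?:\.\d+)?)', expand=False)

# x_num is a hypothetical column name; converting to float drops leading zeros.
df['x_num'] = extracted.astype(float)

print(df)
```

Passing expand=False makes str.extract return a Series instead of a one-column DataFrame, which is convenient when there is a single capture group.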
