Extract the first number from a string number range

Time:10-31

I have a dataset with a price column of type string, and some of the values are ranges (e.g. 15000-20000). I want to extract the first number and convert the entire column to integers.

I tried this: df['price'].apply(lambda x: x.split('-')[0]), but the code just returns the original column.

CodePudding user response:

Try one of the following options:

Data

import pandas as pd

data = {'price': ['0','100-200','200-300']}
df = pd.DataFrame(data)

print(df)

     price
0        0 # adding a str without `-`, to show that this one will be included too
1  100-200
2  200-300

Option 1

  • Use Series.str.split with expand=True and select the first column from the result.
  • Next, chain Series.astype, and assign the result to df['price'] to overwrite the original values.
df['price'] = df.price.str.split('-', expand=True)[0].astype(int)

print(df)

   price
0      0
1    100
2    200

Option 2

  • Use Series.str.extract with a regex pattern, r'(\d+)-?':
  • \d matches a digit.
  • + matches the digit 1 or more times.
  • match stops when we hit - (? specifies "if present at all").
data = {'price': ['0','100-200','200-300']}
df = pd.DataFrame(data)
df['price'] = df.price.str.extract(r'(\d+)-?').astype(int)

# same result
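To check the regex approach end to end, here is a minimal sketch on the same sample data. Passing expand=False is a choice made here (not part of the answer above) so that str.extract returns a Series rather than a one-column DataFrame; the trailing -? is dropped because the capture group already stops at the first non-digit.

```python
import pandas as pd

df = pd.DataFrame({'price': ['0', '100-200', '200-300']})

# (\d+) captures the leading run of digits in each string.
# expand=False makes str.extract return a Series for a single group.
first_number = df['price'].str.extract(r'(\d+)', expand=False)

# Convert the captured strings to integers and overwrite the column.
df['price'] = first_number.astype(int)

print(df['price'].tolist())  # [0, 100, 200]
```

Note that a value with no digits at all would come back as NaN from str.extract, and astype(int) would then raise, so this sketch assumes every row starts with a number.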

CodePudding user response:

Here is one way to do this:

df['price'] = df['price'].str.split('-', expand=True)[0].astype('int')

This stores only the first number of each range. For example, from 15000-20000 only 15000 is stored in the price column.
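As for why the attempt in the question seemed to do nothing: apply returns a new Series and does not modify the DataFrame, so a likely cause is that the result was never assigned back. A minimal sketch, using made-up sample values that are not from the original dataset:

```python
import pandas as pd

# Hypothetical sample values, for illustration only.
df = pd.DataFrame({'price': ['15000-20000', '30000', '500-900']})

# apply returns a new Series; without assignment, df is left untouched.
df['price'].apply(lambda x: x.split('-')[0])
unchanged = df['price'].tolist()
print(unchanged)  # ['15000-20000', '30000', '500-900']

# Assigning the result back (and casting to int) updates the column.
df['price'] = df['price'].apply(lambda x: x.split('-')[0]).astype(int)
print(df['price'].tolist())  # [15000, 30000, 500]
```

The vectorized str.split version above does the same thing and is usually preferable to apply with a lambda.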
