How to extract only integer part from a string in Python?

Time:08-30

I would like to extract only the numbers contained in a string. Can isdigit() and split() be combined for this purpose, or is there a simpler/faster way?

Example:

m = ['How to extract only number 122', 'The number 35 must be extracted', '1052 must be extracted']

Output:

numbers = [122, 35, 1052]
text = ['How to extract only number', 'The number must be extracted', 'must be extracted']

My code:

text = []
numbers = []
temp_numbers = []
for i in range(len(m)):
    text.append([word for word in m[i].split() if not word.isdigit()])
    temp_numbers.append([int(word) for word in m[i].split() if word.isdigit()])
for i in range(len(m)):
    text[i] = ' '.join(text[i])
for elem in temp_numbers:
    numbers.extend(elem)

print(text)
print(numbers)

CodePudding user response:

Import the regular expression module:

import re

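If you want to extract every number from each string:

numbers = []
texts = []
for string in m:
    # \d+ matches one or more consecutive digits
    numbers.extend(int(n) for n in re.findall(r"\d+", string))
    # Remove the digits, then collapse the leftover double spaces
    texts.append(" ".join(re.sub(r"\d+", "", string).split()))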

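If you want to extract only the first number from each string:

numbers = []
texts = []
for string in m:
    # Assumes every string contains at least one number; [0] raises IndexError otherwise
    numbers.append(int(re.findall(r"\d+", string)[0]))
    texts.append(" ".join(re.sub(r"\d+", "", string).split()))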
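
The first-number case can also be handled with re.search, which stops at the first match and makes it easy to cope with strings that contain no digits at all. A minimal sketch; the helper name first_number is hypothetical and not part of the answer above:

```python
import re

m = ['How to extract only number 122', 'The number 35 must be extracted', '1052 must be extracted']

def first_number(string):
    """Return the first run of digits in string as an int, or None if there is none."""
    match = re.search(r"\d+", string)
    return int(match.group()) if match else None

first_numbers = [first_number(s) for s in m]
print(first_numbers)  # [122, 35, 1052]
```

Returning None for a string without digits is just one possible choice; raising an error or skipping the string may suit other cases better.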

CodePudding user response:

If m is a list of strings, you can loop over each string, check whether each character is a digit, and collect the digits into a number.

For loop solution:

m = ['How to extract only number 122', 'The number 35 must be extracted', '1052 must be extracted']

numbers = []
temp_num = ""

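for string in m:
    # Presuming m only contains strings

    for char in string:
        if char.isdigit():
            temp_num += char  # build the number one digit at a time

    # Every digit in the string is joined, so this assumes one number per string
    if temp_num:  # skip strings without digits; int("") would raise ValueError
        numbers.append(int(temp_num))
    temp_num = ""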

List comprehension solution (note that this appends each digit as a separate element, not whole numbers):

m = ['How to extract only number 122', 'The number 35 must be extracted', '1052 must be extracted']

numbers = [int(char) for string in m for char in string if char.isdigit()]
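
If whole numbers are wanted from a comprehension, one option is to test whitespace-separated words instead of characters. A minimal sketch, assuming numbers are separated from other text by whitespace (a token such as "35," would be skipped) and consist of plain ASCII digits:

```python
m = ['How to extract only number 122', 'The number 35 must be extracted', '1052 must be extracted']

# Test whole words rather than single characters, so "122" stays one number
numbers = [int(word) for string in m for word in string.split() if word.isdigit()]
print(numbers)  # [122, 35, 1052]
```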

Hope this helped. Also, if you only need the values of an iterable (e.g. a list), just use for varname in iterable instead of indexing with range(len(iterable)); it's faster and cleaner.

If you need both the index and the value, use for index, varname in enumerate(iterable).
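
As a small illustration of the two loop forms on the same list m (the names first_words and indexed are only for this example):

```python
m = ['How to extract only number 122', 'The number 35 must be extracted', '1052 must be extracted']

# Values only: iterate over the list directly
first_words = []
for string in m:
    first_words.append(string.split()[0])
print(first_words)  # ['How', 'The', '1052']

# Index and value together: enumerate yields (index, value) pairs
indexed = []
for index, string in enumerate(m):
    indexed.append((index, string.split()[0]))
print(indexed)  # [(0, 'How'), (1, 'The'), (2, '1052')]
```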

CodePudding user response:

nums_list = []
m = ["How to extract only number 122", "The number 35 must be extracted", "1052 must be extracted"]
for i in m:
    new_l = i.split(" ")
    for j in new_l:
        if j.isdigit():
            nums_list.append(int(j))
print(nums_list)

Output:

[122, 35, 1052]
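
The same split() and isdigit() approach can also produce the cleaned text the question asks for. A minimal sketch; the function name split_numbers_and_text is hypothetical, and it assumes numbers appear as separate whitespace-delimited words:

```python
def split_numbers_and_text(strings):
    """Separate whole-number words from the remaining text of each string."""
    numbers = []
    text = []
    for string in strings:
        words = string.split()
        # Words made only of digits become ints
        numbers.extend(int(w) for w in words if w.isdigit())
        # Everything else is joined back into a sentence
        text.append(' '.join(w for w in words if not w.isdigit()))
    return numbers, text

m = ["How to extract only number 122", "The number 35 must be extracted", "1052 must be extracted"]
numbers, text = split_numbers_and_text(m)
print(numbers)  # [122, 35, 1052]
print(text)     # ['How to extract only number', 'The number must be extracted', 'must be extracted']
```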