I would like to extract only the numbers contained in a string. Can isdigit() and split() be combined for this purpose, or is there a simpler/faster way?
Example:
m = ['How to extract only number 122', 'The number 35 must be extracted', '1052 must be extracted']
Output:
numbers = [122, 35, 1052]
text = ['How to extract only number', 'The number must be extracted', 'must be extracted']
My code:
text = []
numbers = []
temp_numbers = []
for i in range(len(m)):
    text.append([word for word in m[i].split() if not word.isdigit()])
    temp_numbers.append([int(word) for word in m[i].split() if word.isdigit()])
for i in range(len(m)):
    text[i] = ' '.join(text[i])
for elem in temp_numbers:
    numbers.extend(elem)
print(text)
print(numbers)
CodePudding user response:
Import the regex library:
import re
If you want to extract all numbers from each string:
numbers = []
texts = []
for string in m:
    # \d+ matches one or more consecutive digits; findall returns them as strings
    numbers.append(re.findall(r"\d+", string))
    texts.append(re.sub(r"\d+", "", string).strip())
If you want to extract only the first number from each string:
numbers = []
texts = []
for string in m:
    # [0] takes the first match; this raises IndexError if the string has no digits
    numbers.append(re.findall(r"\d+", string)[0])
    texts.append(re.sub(r"\d+", "", string).strip())
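The snippets above return the numbers as strings, grouped per input string, and removing a number from the middle of a sentence leaves a double space behind. A minimal sketch of one way to get the exact output asked for in the question (a flat list of ints plus cleaned text), assuming every number is a plain run of digits with no sign or decimal point:

```python
import re

m = ['How to extract only number 122', 'The number 35 must be extracted', '1052 must be extracted']

numbers = []
text = []
for string in m:
    # findall returns strings, so convert each match to int and flatten into one list
    numbers.extend(int(n) for n in re.findall(r"\d+", string))
    # replace digit runs with a space, then split/join to collapse leftover whitespace
    text.append(" ".join(re.sub(r"\d+", " ", string).split()))

print(numbers)  # [122, 35, 1052]
print(text)     # ['How to extract only number', 'The number must be extracted', 'must be extracted']
```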
CodePudding user response:
Since m is a list, you can loop through each string, check whether the current character is a digit, and if so add it to a temporary string that is converted to a number at the end.
For loop solution:
m = ['How to extract only number 122', 'The number 35 must be extracted', '1052 must be extracted']
numbers = []
temp_num = ""
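for string in m:
    # Presuming m only contains strings
    for char in string:
        if char.isdigit():
            # Accumulate digits; all digits in one string end up in a single number
            temp_num += char
    if temp_num:  # skip strings that contain no digits
        numbers.append(int(temp_num))
    temp_num = ""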
List comprehension solution (note that this appends each digit as a separate element, e.g. 1, 2, 2 instead of 122):
m = ['How to extract only number 122', 'The number 35 must be extracted', '1052 must be extracted']
numbers = [int(char) for string in m for char in string if char.isdigit()]
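If the digits should stay together as whole numbers while still using a comprehension and isdigit(), one possible sketch groups consecutive characters with itertools.groupby from the standard library. This is an alternative technique, not part of the answer above, and it assumes numbers are plain runs of digits:

```python
from itertools import groupby

m = ['How to extract only number 122', 'The number 35 must be extracted', '1052 must be extracted']

# groupby splits each string into runs of characters that share the same
# str.isdigit result; only the digit runs are kept and joined back into a number
numbers = [
    int("".join(group))
    for string in m
    for is_digit, group in groupby(string, key=str.isdigit)
    if is_digit
]

print(numbers)  # [122, 35, 1052]
```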
Hope this helped. Also, if you only need the values of an iterable (e.g. a list), just use for varname in iterable instead of indexing with range(len(iterable)); it's faster and cleaner.
If you need both the index and the value, use for index, varname in enumerate(iterable).
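As a small illustration of the two loop styles, here is a sketch using the same list m. The names values and indexes_with_digits are made up for this example:

```python
m = ['How to extract only number 122', 'The number 35 must be extracted', '1052 must be extracted']

# Values only: iterate over the list directly, no range(len(m)) needed
values = []
for string in m:
    values.append(string.upper())

# Index and value together: enumerate yields (index, value) pairs
indexes_with_digits = []
for index, string in enumerate(m):
    if any(char.isdigit() for char in string):
        indexes_with_digits.append(index)

print(indexes_with_digits)  # [0, 1, 2]
```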
CodePudding user response:
nums_list = []
m = ["How to extract only number 122", "The number 35 must be extracted", "1052 must be extracted"]
for i in m:
    # Split each string into words and keep the ones made up only of digits
    new_l = i.split(" ")
    for j in new_l:
        if j.isdigit():
            nums_list.append(int(j))
print(nums_list)
Output:
[122, 35, 1052]