I am trying to extract numbers from a text using the code below:
>>> import re
>>> text="Technical Details Item Weight381 g Product Dimensions29.8 x 8.2 x 5.4 cm Best Sellers Rank 1,239,929 in Toys & Games (See top 100)"
>>> re.findall('Best Sellers Rank (.+?) in', str(text))
['1,239,929']
The output of the code is saved as a list. However, my goal is to extract the number as a numeric object (i.e. 1239929).
CodePudding user response:
You're almost there. re.findall returns a list, so take its first element:
import re
text="Technical Details Item Weight381 g Product Dimensions29.8 x 8.2 x 5.4 cm Best Sellers Rank 1,239,929 in Toys & Games (See top 100)"
print(re.findall('Best Sellers Rank (.+?) in', str(text))[0])
This gives:
1,239,929
If you want it as a single number, this is possibly a duplicate, as @mkrieger1 mentioned. Apply the method from "How to convert a string to a number if it has commas in it as thousands separators?" to the extracted string.
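As a minimal sketch of that conversion, assuming the commas in the match are always thousands separators, you can strip them and pass the result to int(). Using re.search here instead of re.findall is my own choice, so that a missing match can be handled explicitly:

```python
import re

text = ("Technical Details Item Weight381 g Product Dimensions29.8 x 8.2 x 5.4 cm "
        "Best Sellers Rank 1,239,929 in Toys & Games (See top 100)")

# re.search returns a match object, or None when the pattern is not found
match = re.search(r"Best Sellers Rank (.+?) in", text)

rank = None
if match:
    # Drop the thousands separators, then convert to an integer
    rank = int(match.group(1).replace(",", ""))

print(rank)
```

The None check avoids the IndexError that indexing [0] on an empty findall result would raise.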
CodePudding user response:
import re
text="Technical Details Item Weight381 g Product Dimensions29.8 x 8.2 x 5.4 cm Best Sellers Rank 1,239,929 in Toys & Games (See top 100)"
If you want a single integer then:
int(''.join(re.findall('Best Sellers Rank (.+?) in', str(text))[0].split(',')))
#output
1239929
CodePudding user response:
Combining the suggestions of mkrieger1 and Bhargav. This uses only the first match; if you want to support more than one, you'll have to iterate over the list that findall returns.
import re, locale
locale.setlocale( locale.LC_ALL, 'en_US.UTF-8')
text = "Technical Details Item Weight381 g Product Dimensions29.8 x 8.2 x 5.4 cm Best Sellers Rank 1,239,929 in Toys & Games (See top 100)"
print(locale.atoi(re.findall('Best Sellers Rank (.+?) in', text)[0]))
Output:
1239929
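If the text can contain several ranks, iterating over every match could look like the sketch below. The two-rank text is a made-up example, and the stricter pattern ([\d,]+) plus the plain replace() in place of locale.atoi are my own assumptions, chosen so the code does not depend on the en_US.UTF-8 locale being installed:

```python
import re

# Hypothetical text containing two ranks, for illustration only
text = ("Best Sellers Rank 1,239,929 in Toys & Games (See top 100) "
        "Best Sellers Rank 4,512 in Board Games")

# [\d,]+ matches only digits and commas, so stray text is never captured
matches = re.findall(r"Best Sellers Rank ([\d,]+) in", text)

# Convert each captured string to an int after removing the commas
ranks = [int(m.replace(",", "")) for m in matches]

print(ranks)  # [1239929, 4512]
```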