How do I find 10-digit numbers in a text by their first three digits? Python

Time:10-06

I need to find all 10-digit numbers in a text that start with one of a given set of digit prefixes. Here is an example:

a_string = "Some text 6401104219 and 6401104202 and 2201104202"

matches = ["240", "880", "898", "910", "920", "960", "209", "309", "409", "471", "640"]

The expected result is: 6401104219, 6401104202

CodePudding user response:

You can use regular expressions and str.startswith:

import re

result = [s for s in re.findall(r"\d{10}", a_string) if any(map(s.startswith, matches))]
# ['6401104219', '6401104202']
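As a variation on the same idea, str.startswith also accepts a tuple of prefixes, so the any(map(...)) call is not strictly needed. A minimal sketch using the sample data from the question (the name prefixes is introduced here for illustration only):

```python
import re

a_string = "Some text 6401104219 and 6401104202 and 2201104202"
matches = ["240", "880", "898", "910", "920", "960", "209", "309", "409", "471", "640"]

# str.startswith returns True if the string starts with any element of the tuple.
prefixes = tuple(matches)
result = [s for s in re.findall(r"\d{10}", a_string) if s.startswith(prefixes)]
print(result)  # ['6401104219', '6401104202']
```

This works for prefixes of any length, not only three digits.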

If you know the prefixes are all 3 digits long, you can convert them to a set and look up the first three characters directly instead of testing every prefix:

matches = set(matches)

result = [s for s in re.findall(r"\d{10}", a_string) if s[:3] in matches]

As written, r"\d{10}" also matches the first 10 digits of a longer number. To exclude those, change the regex to r"\b\d{10}\b".
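The effect of the word boundaries can be seen in a small sketch. The sample text with a 12-digit number is an assumption made for illustration, not part of the question:

```python
import re

# Hypothetical sample: one 10-digit number and one 12-digit number.
text = "Some text 6401104219 and 640110420212"

# Without boundaries, the first 10 digits of the 12-digit number are matched too.
without_boundaries = re.findall(r"\d{10}", text)
# With \b on both sides, only standalone 10-digit numbers are matched.
with_boundaries = re.findall(r"\b\d{10}\b", text)

print(without_boundaries)  # ['6401104219', '6401104202']
print(with_boundaries)     # ['6401104219']
```

Note that \b treats letters and underscores as word characters, so a number glued to a letter (such as "id6401104219") would not match with this pattern.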

CodePudding user response:

You can use a regex.

  • Find all the 10-digit numbers
  • Keep only the numbers that start with one of the elements in the matches list

The code:

import re

a_string = "Some text 6401104219 and 6401104202 and 2201104202"

matches = ["240", "880", "898", "910", "920",
           "960", "209", "309", "409", "471", "640"]

match = re.findall(r'\d{10}', a_string)  # find all the 10-digit numbers

# keep only the numbers that start with one of the elements in matches
ans = [i for i in match if any(map(i.startswith, matches))]
# OR, if every prefix is exactly 3 characters long, check the first three characters directly:
# ans = [i for i in match if i[:3] in matches]
print(ans)
# ['6401104219', '6401104202']

CodePudding user response:

a_string = "Some text 6401104219 and 6401104202 and 2201104202 and    640110420212"
matches = ["240", "880", "898", "910", "920", "960", "209", "309", "409", "471", "640"]

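# split on any whitespace, so repeated spaces produce no empty strings
a_string_list = a_string.split()
for i in a_string_list:
    for j in matches:
        # require exactly 10 characters, all of them digits, starting with prefix j
        if len(i) == 10 and i.isdigit() and i.startswith(j):
            print(i)
            break
# prints:
# 6401104219
# 6401104202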

CodePudding user response:

You can do this with re alone by building the prefixes into the pattern.

import re
a_string = "Some text 6401104219 and 6401104202 and 2201104202 and    640110420212"
matches = ["240", "880", "898", "910", "920", "960", "209", "309", "409", "471", "640"]
# build an alternation of the prefixes, followed by exactly 7 more digits
result = re.findall(r"\b(?:" + "|".join(matches) + r")\d{7}\b", a_string)

print(result)
# ['6401104219', '6401104202']
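If the numbers may be attached to letters, digit-only lookarounds are one possible alternative to \b. The following is a sketch under that assumption; the sample text, the re.escape call, and the names alternation and pattern are illustrative additions, not part of the answer above:

```python
import re

# Hypothetical sample: the second number is wrapped in letters.
a_string = "Some text 6401104219 and id6401104202x and 2201104202 and 640110420212"
matches = ["240", "880", "898", "910", "920", "960", "209", "309", "409", "471", "640"]

# re.escape is a precaution in case a prefix ever contains regex metacharacters.
alternation = "|".join(map(re.escape, matches))
# (?<!\d) and (?!\d) require that no digit sits directly before or after the match.
pattern = re.compile(rf"(?<!\d)(?:{alternation})\d{{7}}(?!\d)")

result = pattern.findall(a_string)
print(result)  # ['6401104219', '6401104202']
```

The 12-digit number is still rejected because a digit follows its first 10 digits, while the number between "id" and "x" is accepted.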