Python Regex only finding some results [duplicate]

Time:10-05

I'm trying to find every invoice number in a document (e.g. INV-12345), but when I paste the results I only get 'INV-' and a lot of blank lines. Any ideas?

import re
import pyperclip

invoiceRegex = re.compile(r'(INV-)?\d{4,6}')

text = pyperclip.paste()

extractedInvoice = invoiceRegex.findall(text)

allInvoices = []

for invoice in extractedInvoice:
    allInvoices.append(invoice)

results = '\n'.join(allInvoices)

pyperclip.copy(results)

CodePudding user response:

re.findall returns the content of the capturing group, if there is exactly one:

The result depends on the number of capturing groups in the pattern. If there are no groups, return a list of strings matching the whole pattern. If there is exactly one group, return a list of strings matching that group. If multiple groups are present, return a list of tuples of strings matching the groups. Non-capturing groups do not affect the form of the result.
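As a rough illustration of the cases described above, here is a small sketch run against a made-up sample string (the sample text and variable names are assumptions for the demo, not taken from the question):

```python
import re

# Hypothetical sample text: one prefixed invoice, one bare number.
sample = "INV-12345 and 67890"

# No groups: findall returns the whole matches.
no_group = re.findall(r'\d{4,6}', sample)

# Exactly one capturing group: findall returns only that group's text.
# Where the optional group did not take part in the match, the entry is ''.
one_group = re.findall(r'(INV-)?\d{4,6}', sample)

# Several capturing groups: findall returns a tuple per match.
two_groups = re.findall(r'(INV-)?(\d{4,6})', sample)

print(no_group)    # ['12345', '67890']
print(one_group)   # ['INV-', '']
print(two_groups)  # [('INV-', '12345'), ('', '67890')]
```

The second case is the pattern from the question, which is why the pasted output contains only 'INV-' and empty lines.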

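So you can make the INV- group non-capturing instead. Note that with the digits in a capturing group, findall returns only the digits, without the prefix: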

invoiceRegex = re.compile(r'(?:INV-)?(\d{4,6})')
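To check what this pattern actually returns, the sketch below runs it on an invented sample string. It also shows a variant, not part of the original answer, that drops the capturing group so the full 'INV-12345' text comes back; the names `digitsOnlyRegex` and `fullMatchRegex` are hypothetical.

```python
import re

# Hypothetical sample text standing in for the clipboard contents.
sample = "Paid INV-12345, pending INV-4321, reference 987654."

# Pattern from the answer: the prefix is non-capturing, the digits are captured,
# so findall returns just the digits.
digitsOnlyRegex = re.compile(r'(?:INV-)?(\d{4,6})')
digits_only = digitsOnlyRegex.findall(sample)

# Variant with no capturing group at all: findall returns the whole match,
# including the optional prefix.
fullMatchRegex = re.compile(r'(?:INV-)?\d{4,6}')
full_matches = fullMatchRegex.findall(sample)

print(digits_only)   # ['12345', '4321', '987654']
print(full_matches)  # ['INV-12345', 'INV-4321', '987654']
```

Which form is preferable depends on whether the prefix should appear in the pasted result.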