Can't find match with regex

Time:12-04

Hi, I'm trying to find the lines that start with "CGK / WIII", but I can only find the first one.

What's wrong with my text? (It was rendered from a PDF file.)

Mytext

I am using Python with the invoice2data package to extract data from PDF invoices into a dataframe, and I ran into a problem with the text rendered from one PDF file.

First I tried the regex \w{3}\s\/[\s\w{4}]* and found that it matches only 1 line.

Then I also tried the fixed text "CGK / WIII", which should find 4 matches, but it does NOT.

I think there are font differences in my text, but I'm not sure.
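One way to test the "font differences" suspicion is to print the code point of every character in a line that fails to match. The snippet below is only a sketch: the `suspect` string is a made-up example containing a no-break space, not the actual text from the PDF, and `describe_chars` is a hypothetical helper name.

```python
import unicodedata

def describe_chars(line):
    """Return (char, code point, Unicode name) for each character in line."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in line
    ]

# Hypothetical line: the character after "CGK" is a no-break space (U+00A0),
# which looks like a normal space but is a different character.
suspect = "CGK\u00a0/ WIII"

for ch, code, name in describe_chars(suspect):
    print(repr(ch), code, name)

# Collect anything outside plain ASCII for a quick check.
non_ascii = [ch for ch in suspect if ord(ch) > 127]
print(non_ascii)
```

If the real text shows only ordinary ASCII letters, spaces and slashes, look-alike characters are probably not the cause.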

CodePudding user response:

When I turn on the global flag ("Don't return after the first match") in your linked example, it shows 4 matches.

Also, a quantifier such as {4} does not work inside a character set (inside []); there the characters {, 4 and } are simply matched literally.
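A small Python sketch of the character-set point, using the pattern from the question on a made-up two-line sample (the real PDF text may behave differently). Because the set also contains \s, the trailing [...]* can run across a line break and swallow the start of the next line:

```python
import re

# Inside [...], "{", "4" and "}" are literal characters, not a quantifier.
char_class = re.compile(r"[\s\w{4}]")
brace_is_literal = bool(char_class.fullmatch("{"))
print(brace_is_literal)

# The original pattern from the question.
original = re.compile(r"\w{3}\s\/[\s\w{4}]*")

# Hypothetical sample: two consecutive lines that should both match.
two_lines = "CGK / WIII\nCGK / WIII"

# The greedy set consumes " WIII", the newline and "CGK " of the second
# line, stopping only at the next "/", so the second line is never matched.
found = original.findall(two_lines)
print(found)
```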

I'd do it like this:
\w{3}\s/\s\w{4}
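In Python, the equivalent of the "global" flag is to use re.findall (or re.finditer) rather than re.search, which stops at the first match. A minimal sketch, assuming a made-up sample text in place of the real PDF output; the ^ anchor with re.MULTILINE is an addition here so that only matches at the start of a line are returned:

```python
import re

# Hypothetical sample standing in for the text extracted from the PDF.
sample_text = (
    "CGK / WIII 01 Dec\n"
    "Some other line\n"
    "CGK / WIII 02 Dec\n"
    "Total CGK / WIII\n"
    "CGK / WIII 03 Dec\n"
    "CGK / WIII 04 Dec\n"
)

# re.MULTILINE makes ^ match at the start of every line, not just the text.
pattern = re.compile(r"^\w{3}\s/\s\w{4}", re.MULTILINE)

# findall returns every match; search would return only the first one.
matches = pattern.findall(sample_text)
first_only = pattern.search(sample_text).group(0)

print(matches)
print(first_only)
```

How invoice2data applies patterns internally is not covered here; this only shows the behaviour of Python's re module.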
