I have this string:
-wordpagefound: 1 offerte 201135455 fam. gaudino, umbau wohnung, winterthur seite 3 von 17
rektifikat
projekt 7514505
pos.nr menge uberschrift artikelnummer richtpreis betrag
me bild artikelbeschreibung exkl. mwst exkl. mwst
dusche - wc eltern
How can I get the number right after -wordpagefound: if I search for "wc"?
I need to get the page on which the term is found, and the text spans multiple lines (it comes from OCR).
I tried this:
preg_match_all('/(-wordpagefound).*([0-9]).*('.$searchText.')/mi', $file->text, $matches, PREG_OFFSET_CAPTURE)
but apparently it doesn't work because of the newlines.
Thank you in advance!
CodePudding user response:
You can use
/-wordpagefound\D*(\d+).*?\bwc\b/si
/-wordpagefound\D*\K\d+(?=.*?\bwc\b)/si
See the regex demo / regex demo #2.
Details:
-wordpagefound - a fixed string
\D* - zero or more non-digits
(\d+) - Group 1: one or more digits
.*? - any zero or more chars, as few as possible (the s flag lets . match line breaks too, which the m flag does not do)
\bwc\b - a whole word wc.
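As a minimal sketch of how the first pattern could be used from PHP with a variable search term: the sample $text below is a shortened copy of the string from the question, and the variable names are only illustrative. It assumes $text holds a single page block; if several -wordpagefound markers were present, the lazy .*? could still run across a later marker.

```php
<?php
// Shortened sample input taken from the question (one OCR page block).
$text = "-wordpagefound: 1 offerte 201135455 fam. gaudino, umbau wohnung, winterthur seite 3 von 17\n"
      . "rektifikat\n"
      . "projekt 7514505\n"
      . "dusche - wc eltern\n";

$searchText = 'wc'; // the term being searched for

// preg_quote() escapes regex metacharacters in the search term;
// its second argument also escapes the "/" delimiter.
$pattern = '/-wordpagefound\D*(\d+).*?\b' . preg_quote($searchText, '/') . '\b/si';

$page = null;
if (preg_match($pattern, $text, $matches)) {
    $page = (int) $matches[1]; // Group 1 holds the digits after the marker
}

echo $page === null ? "not found\n" : "page: $page\n";
```

Passing the search term through preg_quote() keeps a term such as "fam." from being read as regex syntax.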
The second regex is a variation of the first one: \K discards all text matched so far, and the right-hand part of the pattern is enclosed in a positive lookahead, which checks that the pattern is present but excludes it from the match.
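A short sketch of the \K variant in PHP, again using a shortened copy of the question's string as sample input (variable names are illustrative). Here the whole match is the page number itself, so no capture group is needed:

```php
<?php
// Shortened sample input taken from the question.
$text = "-wordpagefound: 1 offerte 201135455 fam. gaudino, umbau wohnung, winterthur seite 3 von 17\n"
      . "rektifikat\n"
      . "dusche - wc eltern\n";

// \K drops "-wordpagefound: " from the reported match;
// the lookahead only checks that the word "wc" follows somewhere later.
$pattern = '/-wordpagefound\D*\K\d+(?=.*?\bwc\b)/si';

$pageNumber = null;
if (preg_match($pattern, $text, $m)) {
    $pageNumber = $m[0]; // the full match is just the digits
}

var_dump($pageNumber);
```

Which variant to use is mostly a matter of taste: the capture-group form is easier to read, while the \K form returns only the number as the match.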