I'm trying to capture "blocks" of my text file that end with a pattern made of several "=" symbols.
I want to capture all of these blocks without the final pattern, but "=" is also used inside one of the capture groups of a block, so the final pattern always ends up in the last match.
Do you know a way to exclude it?
An extract of my regex:
(\d{2}-\d{2}-\d{4} \d{2}:\d{2}) (.*)(Statut)([,:. aA-zZ0-9À-ÖØ-öø-ÿ=><\n\r]*)\n
And the block to analyse:
01-10-2021 16:02 utilisateur1Statut A réaliser =>
Ouverte
01-10-2021 16:03 utilisateur1Statut MyFile.txt
01-10-2021 16:04 utilisateur1Statut
utilisateur1 => utilisateur2
======================================================================
Warning: a block can span one or more rows separated by carriage returns.
Link to the regex101 sample: https://regex101.com/r/hXu3QO/1
CodePudding user response:
The last part of the pattern contains a character class [,:. aA-zZ0-9À-ÖØ-öø-ÿ=><\n\r] that also matches = and newlines, so there is no rule to stop matching.
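To see the over-matching concretely, here is a small Python sketch that runs the pattern from the question against the sample block. The trailing newline after the separator line and the separator length are assumptions made for this illustration.

```python
import re

# Sample from the question. The final "\n" and the separator length (70)
# are assumptions made for this illustration.
sample = (
    "01-10-2021 16:02 utilisateur1Statut A réaliser =>\n"
    "Ouverte\n"
    "01-10-2021 16:03 utilisateur1Statut MyFile.txt\n"
    "01-10-2021 16:04 utilisateur1Statut\n"
    "utilisateur1 => utilisateur2\n"
    + "=" * 70 + "\n"
)

# Pattern as posted in the question: the class allows "=", "\n" and "\r".
original = re.compile(
    r"(\d{2}-\d{2}-\d{4} \d{2}:\d{2}) (.*)(Statut)"
    r"([,:. aA-zZ0-9À-ÖØ-öø-ÿ=><\n\r]*)\n"
)

matches = list(original.finditer(sample))
last_body = matches[-1].group(4)

# The last capture runs through the newline and swallows the "=" separator.
print(repr(last_body))
```

The earlier entries only stop because "-" in the next date is not in the class; nothing stops the last one before the separator.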
Note that aA-zZ is not the same as [a-zA-Z]: the range A-z also covers the ASCII characters between Z and a, which are [ \ ] ^ _ and `.
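A quick way to check the difference, sketched in Python: the six ASCII characters that sit between Z and a are accepted by the loose class and rejected by the strict one. The variable names here are illustrative only.

```python
import re

# ASCII characters located between 'Z' (0x5A) and 'a' (0x61).
between = "[\\]^_`"

loose = re.compile(r"[aA-zZ]")    # 'a', the range 'A'..'z', and 'Z'
strict = re.compile(r"[a-zA-Z]")  # letters only

for ch in between:
    # Prints the character, then whether each class accepts it.
    print(repr(ch), bool(loose.fullmatch(ch)), bool(strict.fullmatch(ch)))
```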
You can exclude the newlines from the character class and then repeatedly match a newline followed by any line that does not start with, for example, === or \d{2}-. You can make the rule as specific as you want, of course.
(\d{2}-\d{2}-\d{4} \d{2}:\d{2}) (.*?)(Statut)\s*([,:. a-zA-Z0-9À-ÖØ-öø-ÿ=><]*(?:\n(?!===|\d{2}-)[,:. a-zA-Z0-9À-ÖØ-öø-ÿ=><]*)*)
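As a rough check of this approach, the following Python sketch applies the pattern to the sample block from the question. It assumes that the character class for the continuation lines is repeated with `*`, and that the sample ends with a newline after the separator; `CHARS`, `pattern` and `bodies` are names chosen for this example.

```python
import re

# Sample from the question; the separator length (70) and the final "\n"
# are assumptions made for this illustration.
sample = (
    "01-10-2021 16:02 utilisateur1Statut A réaliser =>\n"
    "Ouverte\n"
    "01-10-2021 16:03 utilisateur1Statut MyFile.txt\n"
    "01-10-2021 16:04 utilisateur1Statut\n"
    "utilisateur1 => utilisateur2\n"
    + "=" * 70 + "\n"
)

# Allowed characters on a single line: no "\n" or "\r" in the class.
CHARS = r"[,:. a-zA-Z0-9À-ÖØ-öø-ÿ=><]"

pattern = re.compile(
    r"(\d{2}-\d{2}-\d{4} \d{2}:\d{2}) (.*?)(Statut)\s*"
    # First line, then any following line that does not start
    # with "===" or with two digits and a hyphen (the next date).
    r"(" + CHARS + r"*(?:\n(?!===|\d{2}-)" + CHARS + r"*)*)"
)

bodies = []
for m in pattern.finditer(sample):
    timestamp, user, _, body = m.groups()
    bodies.append(body)
    print(timestamp, user, repr(body))
```

Because the negative lookahead is checked right after each newline, the separator line is never consumed, and each entry stops before the next date.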