I have the following text and I want a regex matching the last page of each file: https://regex101.com/r/DmVnK7/1
The correct regex should match only the bolded lines below:
A_File1_Page1
**A_File1_Page2**
A_File2_Page1
A_File2_Page2
**A_File2_Page3**
B_File1_Page1
B_File1_Page2
**B_File1_Page3**
B_File2_Page1
B_File2_Page2
B_File2_Page3
**B_File2_Page4**
C_File1_Page1
C_File1_Page2
C_File1_Page3
C_File1_Page4
**C_File1_Page5**
CodePudding user response:
Regular expression
/(^.*_Page)\d+$(?!\r?\n\1\d+$)/gm
Example
https://regex101.com/r/Q2Ymk2/1
Description
- 1st Capturing Group (^.*_Page)
  - ^ asserts position at the start of a line
  - . matches any character (except for line terminators)
  - * matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
  - _Page matches the characters _Page literally (case sensitive)
- \d matches a digit (equivalent to [0-9])
- + matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
- $ asserts position at the end of a line
- Negative Lookahead (?!\r?\n\1\d+$)
  Assert that the regex below does not match:
  - \r matches a carriage return (ASCII 13)
  - ? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
  - \n matches a line-feed (newline) character (ASCII 10)
  - \1 matches the same text as most recently matched by the 1st capturing group
  - \d matches a digit (equivalent to [0-9])
  - + matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
  - $ asserts position at the end of a line
Global pattern flags
- g modifier: global. All matches (don't return after first match)
- m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
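As a rough illustration, the pattern above can be tried outside regex101 with Python's re module. This is only a sketch: the TEXT sample is copied from the question, the last_pages helper is a hypothetical name, and re.MULTILINE is assumed to stand in for the m flag (finditer already returns all matches, so there is no g flag to set).

```python
import re

# Sample input copied from the question.
TEXT = """A_File1_Page1
A_File1_Page2
A_File2_Page1
A_File2_Page2
A_File2_Page3
B_File1_Page1
B_File1_Page2
B_File1_Page3
B_File2_Page1
B_File2_Page2
B_File2_Page3
B_File2_Page4
C_File1_Page1
C_File1_Page2
C_File1_Page3
C_File1_Page4
C_File1_Page5"""

# A line matches only if the next line does not start with the same
# "<file>_Page" prefix followed by digits.
LAST_PAGE = re.compile(r"(^.*_Page)\d+$(?!\r?\n\1\d+$)", re.MULTILINE)


def last_pages(text):
    """Return every line that is the last page of its file (hypothetical helper)."""
    return [m.group(0) for m in LAST_PAGE.finditer(text)]


print(last_pages(TEXT))
```

Note that this approach only compares each line with the line directly after it, so it assumes the pages of one file are listed consecutively.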
CodePudding user response:
With a regex alone, I think you can only pick out the last occurrence of each file,
mostly because there is no regex construct for counting.
If you need to count, match all pages with (.*?Page\d+)
and then sort and deduplicate the results.
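One possible reading of the counting approach, sketched in Python: match every page, then keep the highest page number seen for each file prefix. The function name last_page_per_file is hypothetical, and splitting the pattern into two capture groups (prefix and number) is an assumption made here so the numbers can be compared as integers.

```python
import re


def last_page_per_file(text):
    """Return the highest-numbered page of each file, sorted by file prefix."""
    last = {}
    # (.*?Page) captures the file prefix, (\d+) the page number.
    for prefix, num in re.findall(r"(.*?Page)(\d+)", text):
        last[prefix] = max(last.get(prefix, 0), int(num))
    return [f"{prefix}{num}" for prefix, num in sorted(last.items())]


# The lines do not have to be in order for this to work.
sample = "A_File1_Page2\nB_File1_Page1\nA_File1_Page10\nA_File1_Page3\n"
print(last_page_per_file(sample))
```

Because the comparison is numeric, Page10 correctly ranks above Page3, which a plain string sort would get wrong.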
If just getting the last page of each file is enough, then use this:
(.*?Page)\d+(?![\s\S]*\1)
https://regex101.com/r/iP3FcV/1
( .*? Page )          # (1)
\d+
(?! [\s\S]* \1 )
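The expanded form maps fairly directly onto Python's re.VERBOSE mode. The following is a sketch under that assumption; the TEXT sample is taken from the question and the last_occurrences helper is a hypothetical name.

```python
import re

# Sample input copied from the question.
TEXT = """A_File1_Page1
A_File1_Page2
A_File2_Page1
A_File2_Page2
A_File2_Page3
B_File1_Page1
B_File1_Page2
B_File1_Page3
B_File2_Page1
B_File2_Page2
B_File2_Page3
B_File2_Page4
C_File1_Page1
C_File1_Page2
C_File1_Page3
C_File1_Page4
C_File1_Page5"""

LAST_OCCURRENCE = re.compile(r"""
    ( .*? Page )        # (1) file prefix, up to and including Page
    \d+                 # page number
    (?! [\s\S]* \1 )    # the same prefix must not appear anywhere later
""", re.VERBOSE)


def last_occurrences(text):
    """Return each page whose file prefix never appears again (hypothetical helper)."""
    return [m.group(0) for m in LAST_OCCURRENCE.finditer(text)]


print(last_occurrences(TEXT))
```

Since the lookahead scans the whole remaining text, this variant does not need the pages of one file to be on consecutive lines, but it can get slow on large inputs because every candidate rescans the rest of the text.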