I'm trying to get only h1
and h2
headers from markdown file using regex, but unfortunately I don't know regex well and can't write the correct solution.
With this expression I'm near the solution (I think so):
/\#{1,2} (.*?)(\\r\\n|\\r|\\n)/gm
But it returns also headers with more than two hashes.
Test case:
# first \r
## second \r
### third \r## fourth \r
This should return ['first', 'second', 'fourth']
CodePudding user response:
Use
/(?<!#)#{1,2} (.*?)(\\r(?:\\n)?|\\n)/gm
See regex proof.
EXPLANATION
--------------------------------------------------------------------------------
(?<! look behind to see if there is not:
--------------------------------------------------------------------------------
# '#'
--------------------------------------------------------------------------------
) end of look-behind
--------------------------------------------------------------------------------
#{1,2} '#' (between 1 and 2 times (matching the
most amount possible))
--------------------------------------------------------------------------------
' '
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
.*? any character except \n (0 or more times
(matching the least amount possible))
--------------------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------------------
( group and capture to \2:
--------------------------------------------------------------------------------
\\ '\'
--------------------------------------------------------------------------------
r 'r'
--------------------------------------------------------------------------------
(?: group, but do not capture (optional
(matching the most amount possible)):
--------------------------------------------------------------------------------
\\ '\'
--------------------------------------------------------------------------------
n 'n'
--------------------------------------------------------------------------------
)? end of grouping
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
\\ '\'
--------------------------------------------------------------------------------
n 'n'
--------------------------------------------------------------------------------
) end of \2