I'm trying to limit the number of empty lines between blocks to .
EOL: '\r'? '\n' '\r'? '\n'? SPACE*;
Is there a way to match '\n' and '\r\n' as one symbol, like '\R' (linebreak) in Perl 5? Or to ignore '\r' completely?
CodePudding user response:
All Lexer rules result in a single "symbol" (in ANTLR they are referred to as tokens).
The typical way to match EOL in ANTLR would be something like:
EOL: '\r'? '\n';
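This matches an optional carriage return followed by a line feed, i.e. either \r\n or \n. Note that the ? must sit outside the quotes: inside a string literal it would be matched as a literal question mark.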
It's pretty customary to put whitespace into a separate rule with a -> skip directive or a -> channel(HIDDEN) directive.
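A minimal sketch of such a whitespace rule, assuming only spaces and tabs should be dropped and that newlines are left to the EOL rule. The rule name WS is just a conventional choice, not something from the question's grammar:

```antlr
// Horizontal whitespace only; line terminators are handled by EOL.
WS: [ \t]+ -> skip;

// Alternative: keep the tokens but hide them from the parser.
// WS: [ \t]+ -> channel(HIDDEN);
```

With -> skip the lexer discards the matched text entirely. With -> channel(HIDDEN) the tokens are still produced, so tools that need the original spacing can read them, but the parser does not see them.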
If you're trying to coalesce end-of-line whitespace and multiple blank lines, try:
EOL: '\r'? '\n' (' '* '\r'? '\n')*;
This generates a single EOL token covering the line terminator plus any subsequent lines that are empty or contain only spaces, including their line terminators.
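To get something closer to Perl's \R, one option is to factor the line break into a fragment rule and reuse it. This is only a sketch: the fragment name NL is hypothetical, and allowing tabs as well as spaces on blank lines is an assumption that goes slightly beyond the rule above.

```antlr
// A fragment never produces a token of its own; it only names a sub-pattern.
fragment NL: '\r'? '\n';

// One EOL token for the line break plus any following blank lines.
EOL: NL ([ \t]* NL)*;
```

Skipping '\r' in a separate rule (for example CR: '\r' -> skip;) would also work for plain line breaks, but then a '\r' between two line feeds ends the EOL token early, so consecutive Windows-style blank lines would no longer coalesce into one token.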
lexer grammar sample:
lexer grammar eol;
ALPHA: [A-Za-z]+ ;
EOL: '\r'? '\n' (' '* '\r'? '\n')*;
sample input file:
ANB



G
KL








ZZ
➜ grun eol tokens -tokens < eol.txt
[@0,0:2='ANB',<ALPHA>,1:0]
[@1,3:6='\n\n\n\n',<EOL>,1:3]
[@2,7:7='G',<ALPHA>,5:0]
[@3,8:8='\n',<EOL>,5:1]
[@4,9:10='KL',<ALPHA>,6:0]
[@5,11:19='\n\n\n\n\n\n\n\n\n',<EOL>,6:2]
[@6,20:21='ZZ',<ALPHA>,15:0]
[@7,22:21='<EOF>',<EOF>,15:2]