Flutter:
Framework • revision 18116933e7 (8 weeks ago) • 2021-10-15 10:46:35 -0700
Engine • revision d3ea636dc5
Tools • Dart 2.14.4
ANTLR4:
antlr4: ^4.9.3
I would like to implement a simple tool that formats text like in the following definition: https://www.motoslave.net/sugarcube/2/docs/#markup-style
So basically each __ is the start of an underlined text and the next __ is the end.
I ran into problems with the following input:
^^subscript=^^
Shell: line 1:13 token recognition error at '^'
Shell: line 1:14 extraneous input '<EOF>' expecting {'==', '//', '''', '__', '~~', '^^', TEXT}
MyLexer.g4:
STRIKETHROUGH : '==';
EMPHASIS : '//';
STRONG : '\'\'';
UNDERLINE : '__';
SUPERSCRIPT : '~~';
SUBSCRIPT : '^^';
TEXT
: ( ~[<[$=/'_^~] | '<' ~'<' | '=' ~'=' | '/' ~'/' | '\'' ~'\'' | '_' ~'_' | '~' ~'~' | '^' ~'^' )
;
MyParser.g4:
options {
tokenVocab=SugarCubeLexer;
//language=Dart;
}
parse
: block EOF
;
block
: statement*
;
statement
: strikethroughStyle
| emphasisStyle
| strongStyle
| underlineStyle
| superscriptStyle
| subscriptStyle
| unstyledStatement
;
unstyledStatement
: plaintext
;
strikethroughStyle
: STRIKETHROUGH (emphasisStyle | strongStyle | underlineStyle | superscriptStyle | subscriptStyle | unstyledStatement)* STRIKETHROUGH
;
emphasisStyle
: EMPHASIS (strikethroughStyle | strongStyle | underlineStyle | superscriptStyle | subscriptStyle | unstyledStatement)* EMPHASIS
;
strongStyle
: STRONG (strikethroughStyle | emphasisStyle | underlineStyle | superscriptStyle | subscriptStyle | unstyledStatement)* STRONG
;
underlineStyle
: UNDERLINE (strikethroughStyle | emphasisStyle | strongStyle | superscriptStyle | subscriptStyle | unstyledStatement)* UNDERLINE
;
superscriptStyle
: SUPERSCRIPT (strikethroughStyle | emphasisStyle | strongStyle | underlineStyle | subscriptStyle | unstyledStatement)* SUPERSCRIPT
;
subscriptStyle
: SUBSCRIPT (strikethroughStyle | emphasisStyle | strongStyle | underlineStyle | superscriptStyle | unstyledStatement)* SUBSCRIPT
;
plaintext
: TEXT
;
I would be super happy for any help. Thanks
Answer:
It's your TEXT rule:
TEXT
: (
~[<[$=/'_^~]
| '<' ~'<'
| '=' ~'='
| '/' ~'/'
| '\'' ~'\''
| '_' ~'_'
| '~' ~'~'
| '^' ~'^'
)
;
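You can't write a lexer rule in ANTLR the way you're trying to (i.e. "a '^' unless it's followed by another '^'"). ~'^' means "any single character that is not ^", and that character is consumed as part of the token. So '=' ~'=' matches the two characters =^ and takes the first ^ of your closing ^^ with it.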
If you run your input through grun with the -tokens option, you'll see that the TEXT token swallows everything through the end of the line, including the closing ^^:
[@0,0:1='^^',<'^^'>,1:0]
[@1,2:14='subscript=^^\n',<TEXT>,1:2]
[@2,15:14='<EOF>',<EOF>,2:0]
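The token dump can be reproduced approximately with a small longest-match lexer written in Python. This is only a sketch under stated assumptions: the `lex` helper and `RULES` table are hypothetical, regular expressions stand in for the lexer rules, and the TEXT alternatives are assumed to be repeated with a trailing `+`, which the multi-character TEXT token in the dump above suggests.

```python
import re

# Rules in grammar order. Like ANTLR, prefer the longest match, then the earliest rule.
RULES = [
    ("STRIKETHROUGH", re.compile(r"==")),
    ("EMPHASIS", re.compile(r"//")),
    ("STRONG", re.compile(r"''")),
    ("UNDERLINE", re.compile(r"__")),
    ("SUPERSCRIPT", re.compile(r"~~")),
    ("SUBSCRIPT", re.compile(r"\^\^")),
    # Assumed form of the question's TEXT rule, with a trailing '+'.
    ("TEXT", re.compile(r"(?:[^<\[$=/'_^~]|<[^<]|=[^=]|/[^/]|'[^']|_[^_]|~[^~]|\^[^^])+")),
]

def lex(text):
    """Return (tokens, errors); errors are (offset, character) pairs no rule could match."""
    pos, tokens, errors = 0, [], []
    while pos < len(text):
        best = None
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            # Strictly longer wins, so on a tie the earlier rule is kept.
            if m and (best is None or m.end() > best[1].end()):
                best = (name, m)
        if best is None:
            errors.append((pos, text[pos]))  # roughly a "token recognition error"
            pos += 1
        else:
            tokens.append((best[0], best[1].group()))
            pos = best[1].end()
    return tokens, errors

with_newline = lex("^^subscript=^^\n")    # the grun case: input file ends with a newline
without_newline = lex("^^subscript=^^")   # the case from the question
print(with_newline)
print(without_newline)
```

With the trailing newline, the sketch yields a `^^` token followed by a single TEXT token covering `subscript=^^\n`. Without it, the last `^` at offset 13 matches no rule, which corresponds to the "line 1:13 token recognition error" in the question.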
Try something like this:
grammar MyParser;
parse: block EOF;
block: statement*;
statement
: STRIKETHROUGH statement STRIKETHROUGH # Strikethrough
| EMPHASIS statement EMPHASIS # Emphasis
| STRONG statement STRONG # Strong
| UNDERLINE statement UNDERLINE # Underline
| SUPERSCRIPT statement SUPERSCRIPT # SuperScript
| SUBSCRIPT statement SUBSCRIPT # Subscript
| plaintext # unstyledStatement
;
// TEXT matches a single character, so plain text is a run of one or more TEXT tokens
plaintext: TEXT+ ;
STRIKETHROUGH: '==';
EMPHASIS: '//';
STRONG: '\'\'';
UNDERLINE: '__';
SUPERSCRIPT: '~~';
SUBSCRIPT: '^^';
TEXT: .;
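As a rough check of why the simplified lexer helps, the following Python sketch tokenizes the failing input the way these rules would: two-character delimiters first, otherwise one character per TEXT token. The `tokenize` helper and the `NAMES` table are hypothetical, and the regular expression only approximates ANTLR's longest-match behaviour.

```python
import re

# Delimiters are listed before '.', so a two-character delimiter wins over a single character.
TOKEN = re.compile(r"==|//|''|__|~~|\^\^|.", re.DOTALL)

NAMES = {
    "==": "STRIKETHROUGH",
    "//": "EMPHASIS",
    "''": "STRONG",
    "__": "UNDERLINE",
    "~~": "SUPERSCRIPT",
    "^^": "SUBSCRIPT",
}

def tokenize(text):
    """Split text into (token name, lexeme) pairs; anything that is not a delimiter is TEXT."""
    return [(NAMES.get(m.group(), "TEXT"), m.group()) for m in TOKEN.finditer(text)]

tokens = tokenize("^^subscript=^^")
print(tokens[0], tokens[-1], len(tokens))
```

Because each TEXT token is a single character, the lone `=` can no longer drag the following `^` along, and the closing `^^` is recognized as SUBSCRIPT. The trade-off is that the parser, not the lexer, has to group the single-character TEXT tokens into runs of plain text.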