JavaScript regex to match JSDoc tags inside a documentation block

Right now I can isolate each JSDoc-ish block in my code. For example, I have this block:

/// @tag {type} name description
/// description continue
/// another description line
/// @tag {type} name description
/// description continue
/// @tag name description just one line.
/// @tag {type} name

I want to use a regex to match each tag inside the block. A tag starts with @tag, followed by an optional type in braces ({type}), then a required name, and finally an optional description. The description can span a single line or multiple lines.

The regex I came up with was:

^\/{3} @(?<tag>\w+)(?:[ \t]+{(?<type>.*)})?(?:[ \t]+(?<name>\w+))(?:[ \t]+(?<desc>[\s\S]*))?

My problem is with the description: once the match reaches the description, it doesn't stop at the start of the next tag. I get the feeling that I'm using a greedy approach, but I cannot find a way to make it non-greedy.

So the example above matches:

tag: tag

name: name

description:

description
/// description continue
/// another description line
/// @tag {type} name description
/// description continue
/// @tag name description just one line.
/// @tag {type} name

I want the description to stop when a new tag starts or when the block ends.
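The behaviour described above can be reproduced with a short script. This is only a sketch: the variable names (`block`, `greedy`, `first`) are made up for illustration, the quantifiers are written out as `+`, and the braces are escaped for safety.

```javascript
// Sample block from the question, joined with newlines.
const block = [
  "/// @tag {type} name description",
  "/// description continue",
  "/// another description line",
  "/// @tag {type} name description",
  "/// description continue",
  "/// @tag name description just one line.",
  "/// @tag {type} name",
].join("\n");

// [\s\S]* is greedy and nothing tells it where to stop,
// so it consumes everything up to the end of the input.
const greedy = /^\/{3} @(?<tag>\w+)(?:[ \t]+\{(?<type>.*)\})?(?:[ \t]+(?<name>\w+))(?:[ \t]+(?<desc>[\s\S]*))?/;

const first = block.match(greedy);
console.log(first.groups.tag);  // "tag"
console.log(first.groups.name); // "name"
// The description swallows the later tags as well.
console.log(first.groups.desc.includes("/// @tag")); // true
```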

CodePudding user response：

Use

/^\/{3} @(?<tag>\w+)(?:[ \t]+{(?<type>[^{}]*)})?[ \t]+(?<name>\w+)(?:[ \t]+(?<desc>.*(?:\n(?!\/{3} @\w).*)*))?/gm

See regex proof.
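As a rough sketch of how a pattern like this might be applied, the snippet below runs it over the sample block with `String.prototype.matchAll`. The names `block`, `tagPattern` and `tags` are hypothetical. Stripping the leading `/// ` from continuation lines is an extra post-processing step assumed here, not something the regex does, and the braces are escaped for safety.

```javascript
// Sample block from the question.
const block = [
  "/// @tag {type} name description",
  "/// description continue",
  "/// another description line",
  "/// @tag {type} name description",
  "/// description continue",
  "/// @tag name description just one line.",
  "/// @tag {type} name",
].join("\n");

// The description takes the rest of the tag line, then any following line
// that does not itself start a new tag (negative lookahead after \n).
const tagPattern = /^\/{3} @(?<tag>\w+)(?:[ \t]+\{(?<type>[^{}]*)\})?[ \t]+(?<name>\w+)(?:[ \t]+(?<desc>.*(?:\n(?!\/{3} @\w).*)*))?/gm;

const tags = [...block.matchAll(tagPattern)].map(({ groups }) => ({
  tag: groups.tag,
  type: groups.type, // undefined when the tag has no {type}
  name: groups.name,
  // Assumed clean-up: drop the "/// " prefix of continuation lines.
  desc: groups.desc?.replace(/^\/{3} ?/gm, ""),
}));

console.log(tags.length);  // 4
console.log(tags[0].desc); // three lines, without the "///" prefixes
console.log(tags[3].desc); // undefined (no description on the last tag)
```

Because each description stops right before the next `/// @` line, the `g` flag lets the following tag start a fresh match.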

EXPLANATION

--------------------------------------------------------------------------------
  ^                        the beginning of a line (the m flag is set)
--------------------------------------------------------------------------------
  \/{3}                    '/' (3 times)
--------------------------------------------------------------------------------
   @                       ' @'
--------------------------------------------------------------------------------
  (?<tag>                  group and capture to \k<tag>:
--------------------------------------------------------------------------------
    \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \k<tag>
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    [ \t]+                   any character of: ' ', '\t' (tab) (1 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    {                        '{'
--------------------------------------------------------------------------------
    (?<type>                group and capture to \k<type>:
--------------------------------------------------------------------------------
      [^{}]*                   any character except: '{', '}' (0 or
                               more times (matching the most amount
                               possible))
--------------------------------------------------------------------------------
    )                        end of \k<type>
--------------------------------------------------------------------------------
    }                        '}'
--------------------------------------------------------------------------------
  )?                       end of grouping
--------------------------------------------------------------------------------
  [ \t]+                   any character of: ' ', '\t' (tab) (1 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (?<name>                 group and capture to \k<name>:
--------------------------------------------------------------------------------
    \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \k<name>
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    [ \t]+                   any character of: ' ', '\t' (tab) (1 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    (?<desc>                 group and capture to \k<desc>:
--------------------------------------------------------------------------------
      .*                       any character except \n (0 or more
                               times (matching the most amount
                               possible))
--------------------------------------------------------------------------------
      (?:                      group, but do not capture (0 or more
                               times (matching the most amount
                               possible)):
--------------------------------------------------------------------------------
        \n                       '\n' (newline)
--------------------------------------------------------------------------------
        (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
          \/{3}                    '/' (3 times)
--------------------------------------------------------------------------------
           @                       ' @'
--------------------------------------------------------------------------------
          \w                       word characters (a-z, A-Z, 0-9, _)
--------------------------------------------------------------------------------
        )                        end of look-ahead
--------------------------------------------------------------------------------
        .*                       any character except \n (0 or more
                                 times (matching the most amount
                                 possible))
--------------------------------------------------------------------------------
      )*                       end of grouping
--------------------------------------------------------------------------------
    )                        end of \k<desc>
--------------------------------------------------------------------------------
  )?                       end of grouping

CodePudding user response：

Yep, the problem is that the description matcher is greedy. Changing * to *? makes it non-greedy, but the pattern then still needs to know when to stop. You can do that by checking whether the input is over or whether there is a /// @ ahead. Note that the captured description includes the /// at the start of each continuation line. I don't think it's possible to filter them out directly in the regex, so you'd have to post-process the output to remove the ///s from desc.

/^\/{3} @(?<tag>\w+)(?:[ \t]+{(?<type>.*)})?(?:[ \t]+(?<name>\w+))((?:[ \t]+(?<desc>[\s\S]*?)((?=\/\/\/ @)|\s*(?![\s\S]))))?/gm
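A minimal sketch of the lazy approach together with the post-processing step might look like this. JavaScript regular expressions have no `\z` anchor, so the sketch assumes `(?![\s\S])` (no character follows) as the end-of-input check. The names `block`, `lazyPattern`, `cleanDesc` and `tags` are hypothetical.

```javascript
// Sample block from the question.
const block = [
  "/// @tag {type} name description",
  "/// description continue",
  "/// another description line",
  "/// @tag {type} name description",
  "/// description continue",
  "/// @tag name description just one line.",
  "/// @tag {type} name",
].join("\n");

// Lazy description: expand one character at a time until either
// a "/// @" lies ahead or only whitespace remains before end of input.
const lazyPattern = /^\/{3} @(?<tag>\w+)(?:[ \t]+\{(?<type>.*)\})?[ \t]+(?<name>\w+)(?:[ \t]+(?<desc>[\s\S]*?)(?:(?=\/{3} @)|\s*(?![\s\S])))?/gm;

// Post-processing: remove the "///" line prefixes and surrounding whitespace.
function cleanDesc(desc) {
  if (desc === undefined) return undefined;
  return desc.replace(/^\/{3} ?/gm, "").trim();
}

const tags = [...block.matchAll(lazyPattern)].map(({ groups }) => ({
  tag: groups.tag,
  type: groups.type,
  name: groups.name,
  desc: cleanDesc(groups.desc),
}));

console.log(tags.length);  // 4
console.log(tags[2].desc); // "description just one line."
```

The raw capture ends with the newline that precedes the next tag, which is why `cleanDesc` trims the result.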