I am trying to parse nested data types.
Examples: ARRAY<STRING>, MAP<STRING, INTEGER>.
When I have an array, I use the regex ^ARRAY<(.*)>$, which returns one group containing the element data type.
When I have a map, I use the regex ^MAP<(.*),(.*)>$, which returns two groups: the key data type first and the value data type second.
But it does not work when a MAP is the key or value of another MAP.
Examples:
MAP<MAP<STRING, STRING>, STRING>
The result should be MAP<STRING, STRING> and STRING.
ARRAY<MAP<ARRAY<STRING>, MAP<STRING, INTEGER>>>
The result should be MAP<ARRAY<STRING>, MAP<STRING, INTEGER>>
Is it possible to ignore nested matches? If so, how?
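For reference, a minimal reproduction of the problem. Python is an assumption here, since the question does not name a language, and MAP_RE is just an illustrative name. The greedy (.*) backtracks to the last comma in the string, so the split is only right when the value is not itself a MAP:

```python
import re

# The MAP regex from the question, unchanged.
MAP_RE = re.compile(r"^MAP<(.*),(.*)>$")

# Nested MAP as the key: the last comma happens to be the top-level one.
key_nested = MAP_RE.match("MAP<MAP<STRING, STRING>, STRING>")
print([g.strip() for g in key_nested.groups()])
# ['MAP<STRING, STRING>', 'STRING']

# Nested MAP as the value: the last comma sits inside the nested type,
# so both groups come out broken.
value_nested = MAP_RE.match("MAP<STRING, MAP<STRING, INTEGER>>")
print([g.strip() for g in value_nested.groups()])
# ['STRING, MAP<STRING', 'INTEGER>']
```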
CodePudding user response:
Depending on the variant of regex, you can create a sort of pseudo parser using named subroutines. For example, this one works with PHP's PCRE variant:
/
(?(DEFINE)
(?<TYPE> STRING | INTEGER | MAP<(?&TYPE),\s*(?&TYPE)> | ARRAY<(?&TYPE)>))
^(?&TYPE)$
/x
The syntax varies based on the language, so you'd need to look it up.
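Some engines have no subroutine calls at all; Python's built-in re module is one of them. In that case a small depth-counting splitter is an alternative to a recursive pattern. The following is a minimal sketch, assuming Python 3.9+. The helper names split_top_level and type_args are hypothetical, and it returns only the direct type arguments (call it again on each result to go deeper):

```python
import re


def split_top_level(s: str) -> list[str]:
    """Split s on commas that are not enclosed in <...>."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(s):
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
            if depth < 0:
                raise ValueError(f"unbalanced '>' in {s!r}")
        elif ch == "," and depth == 0:
            # A comma at depth 0 separates two top-level arguments.
            parts.append(s[start:i].strip())
            start = i + 1
    if depth != 0:
        raise ValueError(f"unbalanced '<' in {s!r}")
    parts.append(s[start:].strip())
    return parts


def type_args(type_str: str) -> list[str]:
    """Return the direct type arguments of ARRAY<...> or MAP<...>.

    Scalar types such as STRING yield an empty list.
    """
    m = re.fullmatch(r"(ARRAY|MAP)<(.*)>", type_str.strip())
    if m is None:
        return []
    name, inner = m.groups()
    args = split_top_level(inner)
    expected = 1 if name == "ARRAY" else 2
    if len(args) != expected:
        raise ValueError(f"{name} expects {expected} argument(s), got {len(args)}")
    return args


print(type_args("MAP<MAP<STRING, STRING>, STRING>"))
# ['MAP<STRING, STRING>', 'STRING']
print(type_args("ARRAY<MAP<ARRAY<STRING>, MAP<STRING, INTEGER>>>"))
# ['MAP<ARRAY<STRING>, MAP<STRING, INTEGER>>']
```

The regex here only strips the outer ARRAY<...> or MAP<...> wrapper. Finding the right comma is left to the depth counter, which is the part a plain (.*) cannot do.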