I am trying to create a regex to match several different dimensional patterns and their units.
Below are examples of the patterns. Each number can be an integer or a decimal. The dimensional separator can be "x", "by" or "transverse by". The units are cm or mm.
3.4 x 2 cm
3.4 by 2 cm
3.4 mm x 2.0 mm
3.4 cm x 2.0 cm x 2 cm
3.4 x 2 x 2.0 x 2 cm
3.4 cm x 2 cm transverse by 2.0 cm
3.4 transverse by 2.0 cm
3.4 mm transverse by 2.0 mm
4 cm
4.5 cm
So far I have the following:
(\d+(\.\d+|)\s?(x|by)\s?\d+(\.\d+|)(\s?(x|by)\s?\d*(\.?\d+|))?) (cm|mm)
But it doesn't pick up "transverse by", 3.4 mm x 2.0 mm, or 3 mm.
Thanks in advance!
CodePudding user response:
I'm assuming you are using JavaScript or a Perl-compatible library. The following regex uses only common features, so it should work in most engines.
See an example on regexr
((\d*\.)?\d+(\s*(cm|mm))?\s*(x|(transverse )?by)\s*)*(\d*\.)?\d+(\s*(cm|mm))?
Explanation
Let's start with the end:
(\d*\.)?\d+(\s*(cm|mm))?
This will match a decimal value ((\d*\.)?\d+) followed by an optional unit (cm or mm).
This will match simple text like:
- 4 cm
- 4.5 cm
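As a quick check of this number-plus-unit expression, here is a minimal sketch using Python's re module. The thread does not name a language, so Python, the name NUMBER_WITH_UNIT, the helper is_single_measurement and the extra test strings are all assumptions made for illustration.

```python
import re

# One number (integer or decimal) followed by an optional unit.
# \d+ means "one or more digits"; (\d*\.)? is an optional integer part plus dot.
NUMBER_WITH_UNIT = re.compile(r"(\d*\.)?\d+(\s*(cm|mm))?")


def is_single_measurement(text):
    """Return True when the whole string is one number with an optional unit."""
    return NUMBER_WITH_UNIT.fullmatch(text) is not None


# The first two samples come from the answer; the rest are hypothetical.
for sample in ["4 cm", "4.5 cm", "4", "cm", "4 x 2 cm"]:
    print(f"{sample!r}: {is_single_measurement(sample)}")
```

Because the unit group is optional, a bare number such as "4" is accepted too, while a string containing a separator is left for the prefix expression to handle.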
Next is the prefix expression:
((\d*\.)?\d+(\s*(cm|mm))?\s*(x|(transverse )?by)\s*)*
This breaks down into two parts:
The same expression we used before to match a number with a unit:
(\d*\.)?\d+(\s*(cm|mm))?
followed by some separator (e.g. " x ", " by " and " transverse by ") text:
\s*(x|(transverse )?by)\s*
Zero or more prefixes are allowed, so you can match n-dimensional sizes:
- 4.5 cm x 55 mm by 105.3 transverse by 12.001 cm x 9 mm
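To see the prefix and the final number working together, the following sketch pulls dimension expressions out of running text. It assumes Python's re module; the names DIMENSIONS and find_dimensions and the sample sentence are hypothetical, not from the original answer.

```python
import re

# Zero or more "number [unit] separator" prefixes, then a final "number [unit]".
DIMENSIONS = re.compile(
    r"((\d*\.)?\d+(\s*(cm|mm))?\s*(x|(transverse )?by)\s*)*"
    r"(\d*\.)?\d+(\s*(cm|mm))?"
)


def find_dimensions(text):
    """Return every dimension expression found in text as a full-match string."""
    # finditer + group(0) avoids the tuples that findall returns
    # when a pattern contains capture groups.
    return [m.group(0) for m in DIMENSIONS.finditer(text)]


report = "Nodule 3.4 cm x 2 cm transverse by 2.0 cm, cyst 4.5 mm."  # hypothetical input
print(find_dimensions(report))
```

Since the pattern is unanchored, it is suited to searching inside longer text; use fullmatch instead when a string must consist of nothing but the dimensions.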
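If you want to allow at most three dimensions, then instead of * you can use {0,2} (written with an explicit lower bound, because JavaScript and older PCRE versions treat {,2} as literal text):
((\d*\.)?\d+(\s*(cm|mm))?\s*(x|(transverse )?by)\s*){0,2}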
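A small sketch of the bounded version, assuming Python's re module and appending the final number-plus-unit expression to the bounded prefix. The names UP_TO_3D and is_up_to_3d are hypothetical.

```python
import re

# At most two "number [unit] separator" prefixes plus the final number,
# i.e. one to three dimensions in total.
UP_TO_3D = re.compile(
    r"((\d*\.)?\d+(\s*(cm|mm))?\s*(x|(transverse )?by)\s*){0,2}"
    r"(\d*\.)?\d+(\s*(cm|mm))?"
)


def is_up_to_3d(text):
    """Return True when the whole string holds one, two or three dimensions."""
    return UP_TO_3D.fullmatch(text) is not None


print(is_up_to_3d("3.4 cm x 2.0 cm x 2 cm"))  # three dimensions
print(is_up_to_3d("3.4 x 2 x 2.0 x 2 cm"))    # four dimensions
```

The limit only holds when the whole string is tested. An unanchored search on the four-dimensional sample would still return its first three dimensions as a match, so anchoring (fullmatch here, or ^ and $) is probably what you want alongside the bound.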
CodePudding user response:
To get the full matches, you can write the pattern as:
^\d+(?:\.\d+)?(?:\s+[cm]m)?(?:\s+(?:x|(?:transverse\s+)?by)\s+\d+(?:\.\d+)?(?:\s+[cm]m)?)*$
Explanation
^
  Start of string
\d+(?:\.\d+)?
  Match 1+ digits with an optional decimal part
(?:\s+[cm]m)?
  Optionally match 1+ whitespace chars and either cm or mm
(?:
  Non capture group to repeat as a whole part
\s+
  Match 1+ whitespace chars
(?:x|(?:transverse\s+)?by)
  Match either x, transverse by, or by
\s+
  Match 1+ whitespace chars
\d+(?:\.\d+)?
  Match 1+ digits with an optional decimal part
(?:\s+[cm]m)?
  Optionally match 1+ whitespace chars and either cm or mm
)*
  Close the non capture group and optionally repeat
$
  End of string
See a regex demo.
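The anchored pattern can be checked against the sample strings from the question with a short sketch. Python's re module is an assumption here, and the names STRICT and SAMPLES, as well as the two rejected strings, are hypothetical additions.

```python
import re

# Anchored pattern: the entire string must be a dimension expression.
# \s+ (rather than \s*) makes whitespace around separators and units mandatory.
STRICT = re.compile(
    r"^\d+(?:\.\d+)?(?:\s+[cm]m)?"
    r"(?:\s+(?:x|(?:transverse\s+)?by)\s+\d+(?:\.\d+)?(?:\s+[cm]m)?)*$"
)

# The sample strings listed in the question.
SAMPLES = [
    "3.4 x 2 cm",
    "3.4 by 2 cm",
    "3.4 mm x 2.0 mm",
    "3.4 cm x 2.0 cm x 2 cm",
    "3.4 x 2 x 2.0 x 2 cm",
    "3.4 cm x 2 cm transverse by 2.0 cm",
    "3.4 transverse by 2.0 cm",
    "3.4 mm transverse by 2.0 mm",
    "4 cm",
    "4.5 cm",
]

for sample in SAMPLES:
    print(f"{sample!r}: {bool(STRICT.match(sample))}")

# Hypothetical inputs that this stricter pattern rejects.
print(bool(STRICT.match("3.4x2 cm")))  # no whitespace around "x"
print(bool(STRICT.match(".5 mm")))     # no digit before the dot
```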
If you also want to match decimals starting with a dot like .5 mm:
^(\d*\.)?\d+(?:\s+[cm]m)?(?:\s+(?:x|(?:transverse\s+)?by)\s+(\d*\.)?\d+(?:\s+[cm]m)?)*$
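A brief sketch of the leading-dot variant, again assuming Python's re module; the name WITH_LEADING_DOT and the sample strings other than .5 mm are hypothetical.

```python
import re

# (\d*\.)?\d+ accepts "5", "0.5" and ".5", but not "5."
# because at least one digit must follow the dot.
WITH_LEADING_DOT = re.compile(
    r"^(\d*\.)?\d+(?:\s+[cm]m)?"
    r"(?:\s+(?:x|(?:transverse\s+)?by)\s+(\d*\.)?\d+(?:\s+[cm]m)?)*$"
)

for sample in [".5 mm", "0.5 mm x .25 mm", "5. mm"]:
    print(f"{sample!r}: {bool(WITH_LEADING_DOT.match(sample))}")
```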