How can I make group 1 differ based on content in the whole string?

Time:01-11

In our Python system, I'm trying to isolate the second part of a size so that I can save the values separately.

Because the data arrives in many different formats, I have to take a lot of scenarios into consideration. At the same time, our system requires the value to be in group 1 in order to be identified correctly, which adds to the complexity.

This is what I have so far:

(?<=[\/\-])\s*([A-Za-z]+|\w+)+?(?!\d*\s*\)|\d*\)|\w*\))(?!\s*[\/\-]+)

Examples

working

These examples work:

110/116
S/M
S / M
S/M(32-34)
110/116(10-12y)
110/116(S/M)

not working

However, my regex only works correctly on the examples above.

The following 7 are causing issues:

S/M / L /XL
S / M / L / XL
S/M / L/XL
S/M/L/XL
S/M/L/XL(30-32)
S/M / L/XL(30-32)
S/M / L / XL(30-32)

How can I capture those cases as shown in the table below?

| Case | Input               | Expected capture in group 1 |
| 1    | S/M / L /XL         | "L /XL"                     |
| 2    | S / M / L / XL      | "L / XL"                    |
| 3    | S/M / L/XL          | "L/XL"                      |
| 4    | S/M/L/XL            | "L/XL"                      |
| 5    | S/M/L/XL(30-32)     | "L/XL"                      |
| 6    | S/M / L/XL(30-32)   | "L/XL"                      |
| 7    | S/M / L / XL(30-32) | "L / XL"                    |

Issue

How can I capture a "/" in the middle, including the whole part after it (like /XL), but without any trailing parentheses (such as the (30-32))?

For example, for S/M / L / XL(30-32) I want to capture only L / XL.

CodePudding user response:

You can use

(?<=[/-])\s*([A-Z]+(?:\s*/\s*[A-Z]+)?|\d+)\b(?!\s*[/)-])

Details:

  • (?<=[/-]) - a position immediately preceded by / or -
  • \s* - zero or more whitespaces
  • ([A-Z]+(?:\s*/\s*[A-Z]+)?|\d+) - Group 1: one or more uppercase letters, and then an optional sequence of a / char enclosed with zero or more whitespaces and then one or more uppercase letters, or one or more digits
  • \b - a word boundary
  • (?!\s*[/)-]) - immediately to the right of the current location, there can't be zero or more whitespaces and then either /, ) or -.
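
As a quick check, the pattern can be run against the seven cases from the question's table. This is only a minimal sketch: the pattern is written with its + quantifiers spelled out, and the names SECOND_PART and second_part are made up for illustration, not part of the original system. It uses re.search because the lookbehind needs the regex engine to scan for the first position that follows a / or -.

```python
import re

# Pattern from the answer, with the "+" quantifiers written out.
# SECOND_PART and second_part are hypothetical names used for this sketch.
SECOND_PART = re.compile(
    r"(?<=[/-])\s*([A-Z]+(?:\s*/\s*[A-Z]+)?|\d+)\b(?!\s*[/)-])"
)


def second_part(size):
    """Return group 1 of the first match, or None if nothing matches."""
    match = SECOND_PART.search(size)
    return match.group(1) if match else None


# Inputs and expected captures copied from the question's table.
cases = {
    "S/M / L /XL": "L /XL",
    "S / M / L / XL": "L / XL",
    "S/M / L/XL": "L/XL",
    "S/M/L/XL": "L/XL",
    "S/M/L/XL(30-32)": "L/XL",
    "S/M / L/XL(30-32)": "L/XL",
    "S/M / L / XL(30-32)": "L / XL",
}

for text, expected in cases.items():
    print(f"{text!r} -> {second_part(text)!r} (expected {expected!r})")
```

Note that [A-Z] only covers uppercase letters; if lowercase sizes can occur in the data, the character classes would need widening (for instance to [A-Za-z]), which is not tested here.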