Regex: Best way to match all the entity types from México

Time:03-25

I'm trying to write a regex to match the entity types from México.

It should match:

  • S.A.
  • SA
  • S DE RL DE CV
  • S. DE R.L. DE C.V.
  • S.A. DE C.V.
  • S.A.P.I DE C.V.
  • SAPI DE CV
  • SA DE CV
  • S.A.B DE CV
  • SA CV

I'm kind of stuck on the names that have "DE R..." and "DE C..." in the middle: my regex matches the "de R" or "de C" part, but it doesn't match "SA CV", "S.A.", or "SA".

regex:

( s[^a-zA-Z]*((a[^a-zA-Z]*)|(a[^a-zA-Z]*p[^a-zA-Z]*i[^a-zA-Z]*)|(p[^a-zA-Z]*r[^a-zA-Z]*)|(a(  |\.)*b(  |\.)*)|(c(  |\.)*))*){0,}(de  ((r[^a-zA-Z]*(l[^a-zA-Z]*)*)|(c[^a-zA-Z]*(v[^a-zA-Z]*)*))){1,}

Is it possible to do this with a regex, and am I going about it the right way?

CodePudding user response:

You could try it in regex101

My try:

(^S\.?\s?(A\.?\s?)?B?(P\.?I\.?)?(\sDE)?(\sR\.?L\.?)?(\sDE)?(\sC\.?V\.?)?)

https://regex101.com/r/wkZVWw/1
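
One thing to watch with this pattern: it has a start anchor but no end anchor, so a prefix match succeeds on any string that merely begins with S. Below is a minimal sketch of one way around that, assuming Python's `re` module (the language, the helper name `is_entity_type`, and the "SOCIEDAD" test string are my own choices, not part of the answer). It uses `re.fullmatch` so the whole string has to match:

```python
import re

# Pattern copied verbatim from the answer above; note it has ^ but no $.
PATTERN = re.compile(
    r"(^S\.?\s?(A\.?\s?)?B?(P\.?I\.?)?(\sDE)?(\sR\.?L\.?)?(\sDE)?(\sC\.?V\.?)?)"
)

# Entity types listed in the question.
SAMPLES = [
    "S.A.",
    "SA",
    "S DE RL DE CV",
    "S. DE R.L. DE C.V.",
    "S.A. DE C.V.",
    "S.A.P.I DE C.V.",
    "SAPI DE CV",
    "SA DE CV",
    "S.A.B DE CV",
    "SA CV",
]


def is_entity_type(text: str) -> bool:
    """Return True only if the entire string matches the pattern."""
    return PATTERN.fullmatch(text) is not None


for sample in SAMPLES:
    print(f"{sample!r}: {is_entity_type(sample)}")

# re.match only anchors at the start, so the leading "S" alone is enough.
print(PATTERN.match("SOCIEDAD") is not None)  # True
# re.fullmatch rejects it because the rest of the string is not consumed.
print(is_entity_type("SOCIEDAD"))  # False
```

Appending `$` to the pattern would have a similar effect in other regex flavors.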

CodePudding user response:

You could write a pattern with alternations allowing all the variations.

^S(?:\.?A\.?P\.?I|\.?(?:A\.?)?|\.A\.B)?(?:(?: DE R\.?L\.?)?(?: DE)? C\.?V\.?)?$

In parts, the pattern matches:

  • ^ Start of string
  • S Match S literally (it appears in all the examples)
  • (?: Non capture group
    • \.?A\.?P\.?I Match A, P and I, each preceded by an optional dot
    • | Or
    • \.?(?:A\.?)? Match an optional dot, then an optional A followed by an optional dot
    • | Or
    • \.A\.B Match .A.B literally
  • )? Close the non capture group and make it optional
  • (?: Non capture group
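    • (?: DE R\.?L\.?)? Optionally match " DE RL" (with a leading space), with an optional dot after R and after L
    • (?: DE)? Optionally match " DE" (with a leading space)
    • C\.?V\.? Match CV with an optional dot after each letter, preceded by a single space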
  • )? Close the non capture group and make it optional
  • $ End of string
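
To check the pattern against the list from the question, a short Python sketch could look like the following. The choice of Python, the helper name `matches_entity`, and the negative examples are assumptions for illustration only. The pattern is case-sensitive as written, so passing `re.IGNORECASE` to `re.compile` may be worth considering if lowercase input should match too.

```python
import re

# The pattern from the answer, split over two lines for readability only;
# adjacent raw strings are concatenated into the same regex.
ENTITY_RE = re.compile(
    r"^S(?:\.?A\.?P\.?I|\.?(?:A\.?)?|\.A\.B)?"
    r"(?:(?: DE R\.?L\.?)?(?: DE)? C\.?V\.?)?$"
)

# Entity types listed in the question.
SHOULD_MATCH = [
    "S.A.",
    "SA",
    "S DE RL DE CV",
    "S. DE R.L. DE C.V.",
    "S.A. DE C.V.",
    "S.A.P.I DE C.V.",
    "SAPI DE CV",
    "SA DE CV",
    "S.A.B DE CV",
    "SA CV",
]

# Hypothetical negative examples, not taken from the question.
SHOULD_NOT_MATCH = ["LLC", "SA DE", "S.A. DE X.V."]


def matches_entity(text: str) -> bool:
    """Return True if the whole string is one of the accepted variations."""
    return ENTITY_RE.match(text) is not None


for name in SHOULD_MATCH + SHOULD_NOT_MATCH:
    print(f"{name!r}: {matches_entity(name)}")
```

Because both groups after the leading S are optional, a bare "S" is accepted as well; if that is not wanted, the first group would need to be made mandatory for the cases where no " DE ..." part follows.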
