How can I exclude comma from the matched string?

Time:11-09

I have a regex that matches strings between commas, with some additional rules applied. When I put a comma in the pattern to anchor the match, the comma is also included in the matched string.

I want to tweak it so that the match doesn't include the surrounding commas and dots.

Regex

My Regex is - (?i)(\bmarche\s(?:\d{5})?\-?(?:\d{3,5})|(?:\d{5})\-?(?:\d{4})?\smarche\b|[,]marche[\.,]|marche$)

The strings which I'm using to match are -

Diabetologist,5,rue du marche,93160 Noisy le Grand,France.
Diabetologist,5,rue du lemarche,93160 Noisy le Grand,France.
Department of sea science,University Polytechnic of Marche,via Brecce Bianche,60100 Ancona,Italy.
ALL MATCH
Department of sea science,University Polytechnic of ,Marche,via Brecce Bianche,60100 Ancona,Italy.
Department of sea science,University Polytechnic of ,Marche 12345-1234,via Brecce Bianche,60100 
Department of sea science,University Polytechnic of ,Marche
Department of sea science,University Polytechnic of ,Marche.
Department of sea science,University Polytechnic of ,Marche 12345
Department of sea science,University Polytechnic of ,Marche 12345.
Department of sea science,University Polytechnic of ,12345-1234 Marche.
Department of sea science,University Polytechnic of ,12345 Marche.
Department of sea science,University Polytechnic of ,Marche 12345,Italy

I have created the regex, but it needs some tweaks so that it doesn't capture any commas.

Regex - https://regex101.com/r/0PhCK4/1
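
To make the problem concrete, here is a minimal Python sketch that runs the pattern above against two of the sample strings. Python and the names ORIGINAL, SAMPLES and matched are assumptions for illustration only; the post does not say which language or engine is used.

```python
import re

# The pattern from the question, copied unchanged (split over lines for readability).
ORIGINAL = re.compile(
    r"(?i)(\bmarche\s(?:\d{5})?\-?(?:\d{3,5})"
    r"|(?:\d{5})\-?(?:\d{4})?\smarche\b"
    r"|[,]marche[\.,]"
    r"|marche$)"
)

# Two of the sample strings from the question.
SAMPLES = [
    "Department of sea science,University Polytechnic of ,Marche,via Brecce Bianche,60100 Ancona,Italy.",
    "Department of sea science,University Polytechnic of ,Marche.",
]

# group(0) is the full match, including any delimiters the pattern consumed.
matched = [ORIGINAL.search(s).group(0) for s in SAMPLES]

for m in matched:
    print(repr(m))
# prints ',Marche,' then ',Marche.'
```

The third alternative, [,]marche[\.,], consumes the delimiters on both sides, which is why they end up in the match.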

CodePudding user response:

You can match either the start of the string or a comma outside the group, and then capture only the part you want in group 1:

(?:^|,)([Mm]arche(?: \d{5}(?:-\d{4})?)?|(?:\d{5}(?:-\d{4})?)? [Mm]arche)\b

Explanation

  • (?:^|,) Either assert the start of the string or match a comma (outside group 1, so the comma is not captured)
  • ( Capture group 1
    • [Mm]arche Match marche or Marche
    • (?: Non capture group, starting with a literal space
      • \d{5} Match 5 digits
      • (?:-\d{4})? Optionally match - and 4 digits
    • )? Close the non capture group and make the whole group optional
    • | Or
    • (?:\d{5}(?:-\d{4})?)? Optionally match 5 digits followed by an optional - and 4 digits
    • [Mm]arche Match a literal space, then marche or Marche
  • ) Close group 1
  • \b A word boundary
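
As a rough usage sketch, the snippet below reads group 1 instead of the full match, so the leading comma is left out. It assumes Python's re module (the post does not name a language), writes the pattern with [Mm]arche in both alternatives, and uses a hypothetical helper named extract.

```python
import re

# Comma/start-of-string sits outside the capture group; group 1 holds the value.
PATTERN = re.compile(
    r"(?:^|,)([Mm]arche(?: \d{5}(?:-\d{4})?)?|(?:\d{5}(?:-\d{4})?)? [Mm]arche)\b"
)

def extract(text):
    """Return the text captured by group 1, or None when nothing matches."""
    m = PATTERN.search(text)
    return m.group(1) if m else None

# Sample strings taken from the question.
should_match = [
    "Department of sea science,University Polytechnic of ,Marche,via Brecce Bianche,60100 Ancona,Italy.",
    "Department of sea science,University Polytechnic of ,Marche 12345-1234,via Brecce Bianche,60100 ",
    "Department of sea science,University Polytechnic of ,12345-1234 Marche.",
]
should_not_match = [
    "Diabetologist,5,rue du marche,93160 Noisy le Grand,France.",
    "Department of sea science,University Polytechnic of Marche,via Brecce Bianche,60100 Ancona,Italy.",
]

results = [extract(s) for s in should_match]
non_results = [extract(s) for s in should_not_match]

print(results)      # ['Marche', 'Marche 12345-1234', '12345-1234 Marche']
print(non_results)  # [None, None]
```

Reading m.group(0) instead would still include the leading comma, so the capture group is what keeps the delimiter out of the result.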
