Regex Algorithm Named Capture Group Problem in C#

Time:10-06

Named capture groups are not working as expected.

Target

I first search for the specific QR code and then work my way from right to left to find the amount as well as the date of the payment.

Flags

  • Global
  • Multiline
  • Right to Left

Data

Full Regex C#

(?(?!\d{2}\.\d{2}\.\d{2,4}).*\n|(?<Date>.*))*(?(?!CHF\s\d+\.\d+).*\n|(?<Amount>.*))*85 03870 00000 00000 00001 00258

Full String

20.09.2022 Gutschrift QR-Rechnung 20.09.2022 919.05 39'228.62 Betrag CHF 809.10 ********* | ******* | 8704 Angaben ohne Gewähr Seite 10 / 39AEK BANK 1826 ********* · ********** Tel. ************ · www.*****.ch IID/BC-Nr. ******** · PC-Nr. ******** · CHE-******* MWST Kontoauszug Datum Buchungstext Valuta Belastung Gutschrift Saldo CHF Mitteilung / Referenz 85 03870 00000 00000 00001 00263 ************* Entschädigung ************* Betrag CHF 109.95 Mitteilung / Referenz 85 03870 00000 00000 00001 00258

Link: https://regex101.com/r/juS3iw/1

The Issue

The current issue is that the match position is in the wrong place after using (?(?!CHF\s\d+\.\d+).*\n|(?<Amount>.*))* or (?(?!\d{2}\.\d{2}\.\d{2,4}).*\n|(?<Date>.*))*, so I cannot get the named capture groups the way I tried in this example. The description below says that the lookahead does not consume characters, but that does not seem to be the case here.

Description of (?!...): Starting at the current position in the expression, ensures that the given pattern will not match. Does not consume characters.
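
A minimal sketch for checking the quoted description in isolation. It assumes a tiny two-line input invented for illustration and leaves out the Right to Left flag. The lookahead on its own yields a zero-length match, while the yes-branch .*\n of the conditional is what consumes text and moves the position:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical two-line input, not the bank statement from the question.
string input = "line1\nCHF 1.00";

// The negative lookahead alone is zero-width: it succeeds without consuming anything.
Match lookaheadOnly = Regex.Match(input, @"(?!CHF)");

// Inside (?(condition)yes|no) the condition is still zero-width,
// but the yes-branch .*\n consumes the whole first line.
Match conditional = Regex.Match(input, @"(?(?!CHF).*\n|(?<Amount>.*))");

Console.WriteLine($"Lookahead length: {lookaheadOnly.Length}");
Console.WriteLine($"Conditional length: {conditional.Length}");
Console.WriteLine($"Amount captured: {conditional.Groups["Amount"].Success}");
```

In this sketch the Amount group stays empty because the first match is produced entirely by the yes-branch.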

Expected output

  • Amount: CHF 109.95
  • Date: 20.09.2022

Thanks for any help

Current Solution

(?<Date>\d{2}\.\d{2}\.\d{4}).*?(?<Valuta>\d{2}\.\d{2}\.\d{4}).*?(?<Currency>CHF|EUR).*?(?<Amount>\d+\'\d+\.\d{2}|\d+\.\d{2})(?:(?!\d{2}\.\d{2}\.\d{2,4}).)*?85 03870 00000 00000 00001 00258

Do you think that's a good solution?

https://regex101.com/r/7bXOBg/1

Answer:

If it's the first date that you want, you can do it like this:

(?<=\sCHF\s)(?<Amount>\d+\.\d+)(?!.*\sCHF\s)|^(?<Date>\d\d\.\d\d\.\d{4})

Explanation:

  • General form: select amount|select date, where | means OR.
  • (?<=\sCHF\s) the amount must be preceded by " CHF ".
  • (?<Amount>\d+\.\d+) named capture group for the amount.
  • (?!.*\sCHF\s) the amount must not be followed by another " CHF " anywhere later in the string. This ensures that we catch the last amount.
  • ^ the date must start at the beginning of the string.
  • (?<Date>\d\d\.\d\d\.\d{4}) named capture group for the date.
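
The pattern above could be applied from C# roughly as follows. This is a sketch, not a definitive implementation: the variable names and the shortened single-line sample string are assumptions made for illustration. Because the pattern is an alternation, the date and the amount arrive as two separate matches, so Regex.Matches is used rather than a single Regex.Match call.

```csharp
using System;
using System.Text.RegularExpressions;

// Shortened version of the statement text from the question (assumed to be a single line).
string qrString =
    "20.09.2022 Gutschrift QR-Rechnung 20.09.2022 919.05 39'228.62 Betrag CHF 809.10 " +
    "Saldo CHF Mitteilung / Referenz 85 03870 00000 00000 00001 00263 " +
    "Betrag CHF 109.95 Mitteilung / Referenz 85 03870 00000 00000 00001 00258";

var regex = new Regex(
    @"(?<=\sCHF\s)(?<Amount>\d+\.\d+)(?!.*\sCHF\s)|^(?<Date>\d\d\.\d\d\.\d{4})");

string date = "";
string amount = "";

// Each match fills only one of the two named groups, so collect them in a loop.
foreach (Match m in regex.Matches(qrString))
{
    if (m.Groups["Date"].Success) date = m.Groups["Date"].Value;
    if (m.Groups["Amount"].Success) amount = m.Groups["Amount"].Value;
}

Console.WriteLine($"Date: {date}, Amount: {amount}");
// prints "Date: 20.09.2022, Amount: 109.95"
```

Note that the Amount group holds only the number, without the "CHF" prefix, and that the pattern picks the last " CHF " amount in the string rather than tying it to the reference number.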

However, if the date is always at the start of the string, you can get it more efficiently without a regex: string date = qrString.Substring(0, 10); or, with a range (C# 8.0 on .NET Core 3.0 or later): string date = qrString[..10];
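
A short sketch of both variants, assuming the date always occupies the first 10 characters in dd.MM.yyyy format. The sample string is shortened, and the parsing step with DateTime.TryParseExact is an optional addition for illustration, not part of the original answer:

```csharp
using System;
using System.Globalization;

// Hypothetical, shortened input; the date is assumed to be the first 10 characters.
string qrString = "20.09.2022 Gutschrift QR-Rechnung 20.09.2022 919.05";

// Both expressions take the same 10-character prefix.
string viaSubstring = qrString.Substring(0, 10);
string viaRange = qrString[..10];

// Optional: validate the prefix and convert it to a DateTime.
bool parsed = DateTime.TryParseExact(
    viaSubstring, "dd.MM.yyyy", CultureInfo.InvariantCulture,
    DateTimeStyles.None, out DateTime paymentDate);

Console.WriteLine($"{viaSubstring} | {viaRange} | parsed: {parsed}");
```

Both Substring(0, 10) and [..10] throw if the string is shorter than 10 characters, so a length check may be worthwhile on untrusted input.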
