Problems with using Regex but don't know why

Time:09-28

I am making a WPF application where you insert PDF files and it converts them to text. After that, a few Regex patterns are run on the text to give me only the important parts of the PDF.

The first problem I am running into is with numbers: if the number is, for example, 6.90, it comes out as 6.9. I have tried changing my Regex, but it doesn't make a difference.

The second problem is with dates: for example, for 09-06-2022 it just won't write anything. I have also tried changing the Regex, but the date still doesn't show up.

Does anyone know why this is?

This is a line from the PDF I use; I am trying to get only 6.90:

Date: 06-09-2022 € 5.70 € 1.20 € 6.90

This is the Regex I use to get only the amount:

(?<=Date\:?\s?\s?\s?\d{0,2}\-\d{0,2}\-\d{0,4}\s?\€\s\d{0,10}\.?\,?\d{0,2}\s?\€\s\d{0,10}\,?\.?\d{0,10}\s?\€\s)\d{0,10}\.\d{0,2}

This is the Regex I use to get only the date:

(?<=Date\:?\s?\s?\s?)\d{0,2}\-\d{0,2}\-\d{0,4}

There are a lot of "?" in it because I have to make it compatible with multiple different PDFs.

P.S. I didn't know how to ask something like this; I am new to Stack Overflow.
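
A quick way to narrow this down is to run both patterns against the sample line in a small console program. This is only a sketch: the question does not show what happens to the match afterwards, so the `double.Parse` step is an assumption about where the trailing zero could get lost, and `QuestionPatterns` and `FirstMatch` are hypothetical names. On the sample line as typed above, both patterns return the expected text, which suggests looking at the later conversion step or at the exact text extracted from the PDF.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class QuestionPatterns
{
    // The two patterns from the question, copied unchanged.
    public const string AmountPattern =
        @"(?<=Date\:?\s?\s?\s?\d{0,2}\-\d{0,2}\-\d{0,4}\s?\€\s\d{0,10}\.?\,?\d{0,2}\s?\€\s\d{0,10}\,?\.?\d{0,10}\s?\€\s)\d{0,10}\.\d{0,2}";
    public const string DatePattern =
        @"(?<=Date\:?\s?\s?\s?)\d{0,2}\-\d{0,2}\-\d{0,4}";

    // Hypothetical helper: returns the matched text, or null if there is no match.
    public static string FirstMatch(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        string line = "Date: 06-09-2022 € 5.70 € 1.20 € 6.90";

        string amountText = FirstMatch(line, AmountPattern);
        string dateText = FirstMatch(line, DatePattern);
        Console.WriteLine(amountText); // 6.90
        Console.WriteLine(dateText);   // 06-09-2022

        // Assumed conversion step: a double has no notion of trailing zeros,
        // so its default string form drops them.
        double asDouble = double.Parse(amountText, CultureInfo.InvariantCulture);
        Console.WriteLine(asDouble.ToString(CultureInfo.InvariantCulture));         // 6.9
        // An explicit format string puts the two decimals back.
        Console.WriteLine(asDouble.ToString("0.00", CultureInfo.InvariantCulture)); // 6.90
    }
}
```

If the value is only displayed, keeping the matched string as it is avoids the problem entirely; if it has to be a number, formatting with `"0.00"` when writing it out is one option.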

CodePudding user response:

This is much easier without Regex

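            // Needs: using System; using System.Globalization;
            string input = "Date:  06-09-2022 € 5.70 € 1.20 € 6.90";
            // Splitting on ':' and '€' gives: "Date", "  06-09-2022 ", " 5.70 ", " 1.20 ", " 6.90"
            string[] array = input.Split(new char[] {':', '€'});
            // DateTime.Parse uses the current culture, which decides whether 06-09 is day-month or month-day
            DateTime date  = DateTime.Parse(array[1]);
            // Invariant culture so that '.' is always read as the decimal separator
            decimal amount1 = decimal.Parse(array[2], CultureInfo.InvariantCulture);
            decimal amount2 = decimal.Parse(array[3], CultureInfo.InvariantCulture);
            decimal amount3 = decimal.Parse(array[4], CultureInfo.InvariantCulture);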
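
The snippet above throws an exception when a line does not have the expected shape, which may matter when several different PDF layouts are involved. Below is a minimal sketch of a more defensive variant of the same split approach. It assumes the date is day-month-year (`dd-MM-yyyy`) and that amounts use `.` as the decimal separator; `LineParser` and `TryParseLine` are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class LineParser
{
    // Hypothetical helper: splits on ':' and '€', then parses each piece
    // with an explicit format and culture instead of the machine's locale.
    public static bool TryParseLine(string input, out DateTime date, out decimal[] amounts)
    {
        date = default;
        amounts = new decimal[0];

        string[] parts = input.Split(new[] { ':', '€' });
        if (parts.Length < 3) return false; // need a label, a date and at least one amount

        // Assumption: the date is written as day-month-year.
        if (!DateTime.TryParseExact(parts[1].Trim(), "dd-MM-yyyy",
                CultureInfo.InvariantCulture, DateTimeStyles.None, out date))
            return false;

        var result = new decimal[parts.Length - 2];
        for (int i = 2; i < parts.Length; i++)
        {
            if (!decimal.TryParse(parts[i].Trim(), NumberStyles.Number,
                    CultureInfo.InvariantCulture, out result[i - 2]))
                return false;
        }

        amounts = result;
        return true;
    }

    public static void Main()
    {
        string input = "Date:  06-09-2022 € 5.70 € 1.20 € 6.90";
        if (TryParseLine(input, out DateTime date, out decimal[] amounts))
        {
            Console.WriteLine(date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)); // 2022-09-06
            // "0.00" always writes two decimals.
            Console.WriteLine(amounts[2].ToString("0.00", CultureInfo.InvariantCulture)); // 6.90
        }
    }
}
```

Returning `false` instead of throwing lets the caller skip or log lines from PDFs that use a different layout.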

CodePudding user response:

If you still want to use Regex, this is a much simpler solution

Date\:\s{0,}(\d{1,2}-?\d{1,2}-?\d{2,4}).+(\d+\.\d+).+(\d+\.\d+).+(\d+\.\d+)

Breakdown


Date\:\s{0,} matches Date: followed by 0 or more whitespace characters

(\d{1,2}-?\d{1,2}-?\d{2,4}) matches your date string, accepting 1 or 2 digits each for month and day and 2 to 4 digits for the year

.+(\d+\.\d+) matches any characters until it reaches 1 or more digits followed by . and 1 or more digits. This is repeated 3 times to capture the currency values
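
A minimal sketch of how this pattern might be used from C#, written with `+` quantifiers ("1 or more", as the breakdown describes). `SimplerPattern` and `Extract` are hypothetical names. The captured groups are kept as strings, so a value such as 6.90 keeps its trailing zero.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SimplerPattern
{
    // Group 1 is the date, groups 2 to 4 are the three amounts.
    public const string Pattern =
        @"Date\:\s{0,}(\d{1,2}-?\d{1,2}-?\d{2,4}).+(\d+\.\d+).+(\d+\.\d+).+(\d+\.\d+)";

    // Hypothetical helper: returns { date, amount1, amount2, amount3 } as text,
    // or an empty array when the line does not match.
    public static string[] Extract(string input)
    {
        Match m = Regex.Match(input, Pattern);
        if (!m.Success) return new string[0];

        return new[]
        {
            m.Groups[1].Value,
            m.Groups[2].Value,
            m.Groups[3].Value,
            m.Groups[4].Value
        };
    }

    public static void Main()
    {
        string[] parts = Extract("Date: 06-09-2022 € 5.70 € 1.20 € 6.90");
        Console.WriteLine(string.Join(" | ", parts)); // 06-09-2022 | 5.70 | 1.20 | 6.90
    }
}
```

Because the amounts only accept `.` as the separator, a PDF that writes 6,90 would need `[.,]` in place of `\.` in each amount group.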

