I have the data:
P.C 115 P.B 372 Page 2 of 2
Subscriber Number 123456
Service Details Bill Period
SingleBill Educ Plans-Schools College 500Mb 500MO From 01/11/2021 To 30/11/2021
Static IP 1 From 01/11/2021 To 30/11/2021
Local Only From 01/11/2021 To 30/11/2021
Fixed Line Provisioning From 01/11/2021 To 30/11/2021
Discounts
Static IP 100% Rental Discount
Subscriber Number 763848
Service Details Bill Period
SingleBill Educ Plans-Schools College 300Mb 200AB From 01/11/2021 To 30/11/2021
Fixed Line Provisioning From 01/11/2021 To 30/11/2021
I want to get the "Subscriber Number" and the corresponding "Discount" for each subscriber where a "Discount" is available. Is there any way to do this using regex?
I'm using the PDF activities in UiPath to read the text from the PDF. The 'Read PDF' activity returns a String.
Then I'm trying to write a regex, using lookahead and lookbehind, to get the Subscriber Number and the Discount description for each subscriber that has a discount.
I am trying (?<=Subscriber Number)(.*)(?=\n) and I'm able to capture the Subscriber Number, but not the text in between Subscriber Number and the newline.
CodePudding user response:
You can capture both values with
(?m)^Subscriber\s+Number\s+(\d+)(?:\r?\n(?!Discounts).+)*\r?\nDiscounts\s+(.+)
See the regex demo. Details:
(?m)^ - start of a line
Subscriber\s+Number\s+ - Subscriber, one or more whitespaces, Number, one or more whitespaces
(\d+) - Group 1: one or more digits
(?:\r?\n(?!Discounts).+)* - zero or more repetitions of
    \r?\n - an optional carriage return and then a line feed char
    (?!Discounts).+ - a non-empty line that does not start with Discounts
\r?\n - an optional carriage return and then a line feed char
Discounts - a Discounts string
\s+ - one or more whitespaces
(.+) - Group 2: any one or more chars other than a line feed char.
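To sanity-check the pattern outside UiPath, here is a minimal Python sketch run against the sample text from the question. UiPath itself uses the .NET regex engine, so this is only an illustration under the assumption that the constructs used here behave the same way in both engines. The names SAMPLE, PATTERN and find_discounts are hypothetical helpers, not part of UiPath or of the original answer.

```python
import re

# Sample text copied from the question (as returned by the PDF read step).
SAMPLE = """P.C 115 P.B 372 Page 2 of 2
Subscriber Number 123456
Service Details Bill Period
SingleBill Educ Plans-Schools College 500Mb 500MO From 01/11/2021 To 30/11/2021
Static IP 1 From 01/11/2021 To 30/11/2021
Local Only From 01/11/2021 To 30/11/2021
Fixed Line Provisioning From 01/11/2021 To 30/11/2021
Discounts
Static IP 100% Rental Discount
Subscriber Number 763848
Service Details Bill Period
SingleBill Educ Plans-Schools College 300Mb 200AB From 01/11/2021 To 30/11/2021
Fixed Line Provisioning From 01/11/2021 To 30/11/2021
"""

# The pattern from the answer, split across lines for readability.
PATTERN = re.compile(
    r"(?m)^Subscriber\s+Number\s+(\d+)"   # Group 1: the subscriber number
    r"(?:\r?\n(?!Discounts).+)*"          # skip lines that do not start with Discounts
    r"\r?\nDiscounts\s+(.+)"              # Group 2: the line after the Discounts heading
)


def find_discounts(text):
    """Return (subscriber_number, discount_description) pairs found in text."""
    # strip() drops a trailing carriage return when the text uses CRLF line endings.
    return [(m.group(1), m.group(2).strip()) for m in PATTERN.finditer(text)]


if __name__ == "__main__":
    for number, discount in find_discounts(SAMPLE):
        print(number, "->", discount)
```

On this sample only subscriber 123456 is returned, because the block for 763848 has no Discounts line. Note that the skipped-lines group only excludes lines starting with Discounts, so the pattern relies on the discount following its own subscriber block, as it does in this sample.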