Regex that matches only the first occurrence of a price pattern in a string

Time:06-16

I have multiple strings describing different products. Some contain only the price of the actual product, while others also list extra costs. I only want to extract the price of the actual product from the string, not any additional costs. The actual price always appears as the first price in the string, which is why I tried to solve it with a lazy quantifier that stops after the first occurrence of a specific pattern. However, this does not solve my issue.

Example Strings:

2-TB SSD, black, 200.- EUR Extra Costs. Tel. 1234/12345678 oder Tel. 1234/12345678

PC, white case, 320.- price 62.- delivery 95.- setup

PC, black case, 320.- price 62.- delivery 95.- setup

2-TB SSD, white, 200.- EUR, Tel. 1234/12345678 oder Tel. 1234/12345678

My current regex: \d+(?=(\.-)).*?

I basically want to return the digits right before the first occurrence of (.-) in the string. This is done in Java.

CodePudding user response:

In Python, you can use the findall() method from the re module. It is documented on w3schools.com and is easy to use:

https://www.w3schools.com/python/python_regex.asp#findall

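import re
text = "PC, white case, 320.- price 62.- delivery 95.- setup"
# findall returns every match in order, so the first element is the product price
prices = re.findall(r"\d+(?=\.-)", text)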

CodePudding user response:

You might use a capture group:

^.*?\b(\d+)\.-

The pattern matches:

  • ^ Start of string
  • .*? match as few characters as possible
  • \b a word boundary to prevent a partial word match
  • (\d+) capture 1 or more digits in group 1
  • \.- match a dot and a hyphen
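Since the question mentions Java, here is a minimal sketch of how the capture group pattern above could be applied with java.util.regex. The class name FirstPriceCapture and the helper firstPrice are hypothetical names chosen for illustration, and the sketch assumes each input is a single-line string like the examples in the question.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstPriceCapture {
    // Lazy prefix, word boundary, digits captured in group 1, then a literal ".-"
    private static final Pattern FIRST_PRICE = Pattern.compile("^.*?\\b(\\d+)\\.-");

    // Hypothetical helper: returns the first price, or empty if the string has none
    public static Optional<String> firstPrice(String text) {
        Matcher m = FIRST_PRICE.matcher(text);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String[] samples = {
            "2-TB SSD, black, 200.- EUR Extra Costs. Tel. 1234/12345678 oder Tel. 1234/12345678",
            "PC, white case, 320.- price 62.- delivery 95.- setup"
        };
        for (String s : samples) {
            // Prints 200, then 320
            System.out.println(firstPrice(s).orElse("no price found"));
        }
    }
}
```

Because the pattern is anchored at the start and the prefix is lazy, group 1 holds the first price only; the later amounts such as 62.- and 95.- are never reached.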


A PCRE variant that returns only the digits as the match:

^.*?\K\d+(?=\.-)
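Note that \K is a PCRE feature and java.util.regex does not support it. A possible Java equivalent for a match-only result is to drop the prefix and call Matcher.find() just once, which stops at the first match. This is only a sketch; the class name FirstPriceLookahead and the helper firstPrice are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstPriceLookahead {
    // One or more digits, only if directly followed by a literal ".-"
    private static final Pattern PRICE = Pattern.compile("\\d+(?=\\.-)");

    // Hypothetical helper: returns the first match, or null if there is none
    public static String firstPrice(String text) {
        Matcher m = PRICE.matcher(text);
        // A single find() call returns the leftmost match and ignores later ones
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Prints 320
        System.out.println(firstPrice("PC, black case, 320.- price 62.- delivery 95.- setup"));
        // Prints 200
        System.out.println(firstPrice("2-TB SSD, white, 200.- EUR, Tel. 1234/12345678 oder Tel. 1234/12345678"));
    }
}
```

The lookahead keeps ".-" out of the match, so m.group() contains just the digits.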

