I use the following regex to extract values that appear before certain units:
([.\d]+)\s*(?:kg|gr|g)
What I want is to include the unit along with each value. For example, from this string:
"some text 5kg another text 3 g more text 11.5gr end"
I should get:
["5kg", "3 g", "11.5gr"]
I can't wrap my head around how to modify the above expression to get the desired result. Thank you.
CodePudding user response:
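import re
# Number (optionally with a decimal part), optional whitespace, then a unit.
# The lookbehind keeps the match from starting in the middle of a number;
# the lookahead keeps the unit from being a prefix of a longer word.
p = re.compile(r'(?<![\d.])\d+(?:\.\d+)?\s*?(?:gr|kg|g)(?!\w)')
print(p.findall("some text 5kg another text 3 g more text 11.5gr end"))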
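For context on why the pattern in the question returns only the numbers: assuming it is run through Python's `re.findall`, a pattern containing one capturing group yields the group's contents rather than the whole match. A minimal sketch of the difference (the variable names are only illustrative):

```python
import re

text = "some text 5kg another text 3 g more text 11.5gr end"

# One capturing group: findall returns only what the group captured.
numbers_only = re.findall(r"([.\d]+)\s*(?:kg|gr|g)", text)
print(numbers_only)

# No capturing group: findall returns the whole match, unit included.
with_units = re.findall(r"[.\d]+\s*(?:kg|gr|g)", text)
print(with_units)
```

Dropping the parentheses (or making the group non-capturing) is enough to get the unit back. The stricter pattern above additionally guards against matching inside longer numbers or words.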
CodePudding user response:
Another solution (regex demo):
(?i)\b\d+\.?\d*\s*(?:kg|gr?)\b
(?i) - case insensitive
\b - word boundary
\d+\.?\d* - match the amount
\s* - any number of spaces
(?:kg|gr?) - match kg, g or gr
\b - word boundary
import re
p = re.compile(r"(?i)\b\d+\.?\d*\s*(?:kg|gr?)\b")
print(p.findall("some text 5kg another text 3 g more text 11.5gr end"))
Prints:
['5kg', '3 g', '11.5gr']
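If the amount and the unit are needed separately later on, one possible variation is to wrap each part in a named group and iterate with `finditer`. This is only a sketch: the group names `amount` and `unit` are made up for illustration and are not part of the answer above.

```python
import re

# Same pattern as above, with hypothetical named groups for the two parts.
pattern = re.compile(r"(?i)\b(?P<amount>\d+\.?\d*)\s*(?P<unit>kg|gr?)\b")

text = "some text 5kg another text 3 g more text 11.5gr end"

# Build (amount, unit) pairs; the unit is lower-cased because of (?i).
pairs = [
    (float(m.group("amount")), m.group("unit").lower())
    for m in pattern.finditer(text)
]
print(pairs)
```

Here `m.group(0)` would still give the full matched text such as '3 g', so both forms are available from the same match object.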