I want to capture the model of a phone from its title, but not the storage, so the regex should not match xxxGB.
I am expecting to match:
iphone 13 from "iphone 13 256gb - midnight"
iphone 13 pro max from "iphone 13 pro max 256gb - sierra blue"
iphone 13 pro from "iphone 13 pro 128gb - graphite"
galaxy tab a8 from "galaxy tab a8 wifi 128gb - grey"
The regular expression I have is
r'[A-Za-z]+\s?[A-Za-z\ \.\d]*((\spro|\smax|\slight|\smini|\splus|\sultra|\[A-Za-z]?\d+(?!gb)))*|$'
but the negative lookahead only applies to the last digit before "gb", not to the entire number after the space:
apple iphone 13 256gb - midnight
<re.Match object; span=(6, 18), match='iphone 13 25'>
<re.Match object; span=(32, 32), match=''>
apple iphone 13 pro 128gb - graphite
<re.Match object; span=(6, 22), match='iphone 13 pro 12'>
<re.Match object; span=(36, 36), match=''>
apple iphone 13 pro max 256gb - sierra blue
<re.Match object; span=(6, 26), match='iphone 13 pro max 25'>
<re.Match object; span=(43, 43), match=''>
samsung galaxy tab a8 wifi 128gb - grey
<re.Match object; span=(8, 21), match='galaxy tab a8'>
<re.Match object; span=(39, 39), match=''>
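The truncated matches above ('13 25' instead of '13') are what backtracking produces when a greedy \d+ is followed by (?!gb). A minimal sketch of that behaviour, assuming the digit part of the pattern is \d+; the test strings and variable names here are only illustrative:

```python
import re

storage = "256gb"

# Greedy \d+ first takes "256", but "gb" follows, so the lookahead rejects it.
# The engine then backtracks to "25", which is followed by "6g", not "gb".
naive = re.search(r'\d+(?!gb)', storage)
print(naive.group(0))  # 25

# Also forbidding a following digit stops the engine from giving digits back.
strict = re.search(r'\d+(?!\d|gb)', storage)
print(strict)  # None

# A model number that is not followed by "gb" is still matched whole.
model_number = re.search(r'\d+(?!\d|gb)', "iphone 13 256gb")
print(model_number.group(0))  # 13
```

Adding \d to the negative lookahead is one possible fix, not the only one; a word boundary before the number would be another way to make the check apply to the whole token.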
The testing template can be found from here: https://regex101.com/r/dn0Hyr/1
Many thanks!!
CodePudding user response:
You may use this regex to match phone models:
^[A-Za-z]+(?: (?!wifi|\d*gb)[\dA-Za-z]+)*
RegEx Details:
^ : Start
[A-Za-z]+ : Match 1+ letters
(?: (?!wifi|\d*gb)[\dA-Za-z]+)* : Delimited by a space, match 1+ letters or digits as long as the word is not wifi or digits followed by gb. Repeat this group 0 or more times
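As a quick check, the pattern can be run from Python against the titles in the question. This is a minimal sketch: the helper name extract_model and the constant MODEL_RE are made up for illustration, and the pattern is written with + after each character class.

```python
import re

# Pattern from this answer: one word of letters, then space-separated words
# of letters/digits, stopping at "wifi" or at a word like "256gb".
MODEL_RE = re.compile(r'^[A-Za-z]+(?: (?!wifi|\d*gb)[\dA-Za-z]+)*')

titles = [
    "iphone 13 256gb - midnight",
    "iphone 13 pro max 256gb - sierra blue",
    "iphone 13 pro 128gb - graphite",
    "galaxy tab a8 wifi 128gb - grey",
]


def extract_model(title):
    """Return the leading model name, or None if the title does not start with a letter."""
    m = MODEL_RE.match(title)
    return m.group(0) if m else None


for title in titles:
    print(extract_model(title))
# iphone 13
# iphone 13 pro max
# iphone 13 pro
# galaxy tab a8
```

Because the pattern is anchored with ^, a title that starts with the brand, such as "apple iphone 13 256gb - midnight", yields "apple iphone 13". The lookahead is also case-sensitive, so titles with "GB" in upper case would need the re.IGNORECASE flag.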
CodePudding user response:
An alternation between two positive lookaheads:
/^.*(?=\swifi\s\d{3})|^.*(?=\s\d{3})/gm
Segment | Meaning |
---|---|
^.* | Starting with anything BUT a newline occurring zero or more times... |
(?=\swifi\s\d{3}) | ...is a match if it is before a space, literal "wifi", a space, and 3 digits... |
\| | OR |
^.* | ...starting with anything BUT a newline occurring zero or more times... |
(?=\s\d{3}) | ...is a match if it is before a space and 3 digits. |
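The /gm flags are JavaScript-style notation. A rough Python equivalent, assuming the titles arrive as one multi-line string, would use re.MULTILINE with finditer; the names PREFIX_RE, text and models are illustrative only.

```python
import re

# Same alternation as above; re.MULTILINE makes ^ match at every line start,
# and finditer plays the role of the "g" flag.
PREFIX_RE = re.compile(r'^.*(?=\swifi\s\d{3})|^.*(?=\s\d{3})', re.MULTILINE)

text = "\n".join([
    "iphone 13 256gb - midnight",
    "iphone 13 pro max 256gb - sierra blue",
    "iphone 13 pro 128gb - graphite",
    "galaxy tab a8 wifi 128gb - grey",
])

models = [m.group(0) for m in PREFIX_RE.finditer(text)]
print(models)
# ['iphone 13', 'iphone 13 pro max', 'iphone 13 pro', 'galaxy tab a8']
```

Note that \d{3} requires a three-digit storage size, so a title such as "iphone 13 64gb - midnight" produces no match with this pattern.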