How can I regex product titles to exclude sizes, but still capture lines without sizes?

Time:03-31

I've got a list of product titles that I'm trying to remove sizes from. Some lines have sizes, some don't. The ones that do end with one of a specific set of sizes (XS, S, M, L, XL, XXL, ONESIZE, Y6, Y8, Y10, Y12, Y14). I've been able to come up with an expression that captures everything up to the size, but I can't figure out how to also capture the lines that don't have a size.

Input:

Base Pant Y6
Thermal Half Zip Y8
Thermal Sweater Y10
Flare Jacket Y12
Crewneck Sweater
Racing Suit Y14
Long Day Jacket XS
Down Jacket S
Down Puffer M
Slalom Sweater
Racing Stripe Beach Short L
Stellar Stripe Training Track Pant XL
Hoodie XXL
Beanie Hat ONESIZE

My regex

(.*)(?: XS| S| M| L| XL| XXL| ONESIZE| Y6| Y8| Y10| Y12| Y14)$

This results in the following on regex101 (Crewneck Sweater and Slalom Sweater are excluded):

Base Pant
Thermal Half Zip
Thermal Sweater
Flare Jacket

Racing Suit
Long Day Jacket
Down Jacket
Down Puffer

Racing Stripe Beach Short
Stellar Stripe Training Track Pant
Hoodie
Beanie Hat

Desired output (includes Crewneck Sweater and Slalom Sweater)

Base Pant
Thermal Half Zip
Thermal Sweater
Flare Jacket
Crewneck Sweater
Racing Suit
Long Day Jacket
Down Jacket
Down Puffer
Slalom Sweater
Racing Stripe Beach Short
Stellar Stripe Training Track Pant
Hoodie
Beanie Hat

CodePudding user response:

How about:

(.*?)(?: XS| S| M| L| XL| XXL| ONESIZE| Y6| Y8| Y10| Y12| Y14|)(?:\n|$)
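A minimal Python sketch of how this pattern behaves, assuming the titles arrive as one newline-separated string (the sample `text` is a shortened version of the input in the question). Because the pattern consumes `\n` itself, no multiline flag is needed. The empty alternative can also produce a zero-length match at the very end of the input, so the sketch skips those.

```python
import re

# Pattern from the answer above: lazy title, then an optional size
# (note the empty last alternative), then a newline or end of input.
PATTERN = re.compile(
    r"(.*?)(?: XS| S| M| L| XL| XXL| ONESIZE| Y6| Y8| Y10| Y12| Y14|)(?:\n|$)"
)

# Shortened sample input (assumption: one title per line).
text = (
    "Base Pant Y6\n"
    "Crewneck Sweater\n"
    "Down Jacket S\n"
    "Hoodie XXL\n"
    "Beanie Hat ONESIZE"
)

# Keep group 1 of every match, ignoring zero-length matches.
titles = [m.group(1) for m in PATTERN.finditer(text) if m.group(0)]
print(titles)
```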

CodePudding user response:

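I like rabinzel's answer, but I'd alternatively recommend making the size group optional (with the multiline flag set, so that $ matches at the end of each line):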

(.*?)(?: XS| S| M| L| XL| XXL| ONESIZE| Y6| Y8| Y10| Y12| Y14)?$
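One way to try this pattern in Python is to apply it to each line separately, which sidesteps the question of whether `$` matches at line ends. This is only a sketch: the helper name `strip_size` and the sample list are assumptions made for illustration.

```python
import re

# Pattern from the answer above: the size group is optional, so a line
# without a size still matches and group 1 holds the whole title.
PATTERN = re.compile(
    r"(.*?)(?: XS| S| M| L| XL| XXL| ONESIZE| Y6| Y8| Y10| Y12| Y14)?$"
)

def strip_size(title: str) -> str:
    """Return the title without a trailing size (hypothetical helper)."""
    match = PATTERN.match(title)
    return match.group(1) if match else title

lines = [
    "Thermal Half Zip Y8",
    "Crewneck Sweater",
    "Slalom Sweater",
    "Racing Stripe Beach Short L",
    "Stellar Stripe Training Track Pant XL",
]

cleaned = [strip_size(line) for line in lines]
print(cleaned)
```

When matching against the whole multi-line text instead, compile the pattern with `re.MULTILINE`; otherwise `$` only matches at the end of the string.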

And a minified version:

(.*?)(?: (X?S|X{0,2}L|M|ONESIZE|Y(6|8|10|12|14)))?$
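When shortening the alternation, it is worth checking that the compact form still accepts exactly the twelve sizes from the question. The sketch below compares two candidate size sub-patterns, one with an unbounded `X*` prefix and one with bounded quantifiers. The names `UNBOUNDED`, `BOUNDED`, and `accepted`, and the extra test tokens, are assumptions made for illustration.

```python
import re

# The sizes listed in the question.
SIZES = ["XS", "S", "M", "L", "XL", "XXL", "ONESIZE",
         "Y6", "Y8", "Y10", "Y12", "Y14"]

# Two compact candidates for the size part only (no leading space).
UNBOUNDED = re.compile(r"X*(?:S|L)|M|ONESIZE|Y(?:6|8|10|12|14)")
BOUNDED = re.compile(r"X?S|X{0,2}L|M|ONESIZE|Y(?:6|8|10|12|14)")

# Valid sizes plus a few tokens that should be rejected.
CANDIDATES = SIZES + ["XXS", "XXXL", "XM", "Y16", "Y1"]

def accepted(pattern, tokens):
    """Return the tokens that the pattern matches in full."""
    return [t for t in tokens if pattern.fullmatch(t)]

unbounded_extra = [t for t in accepted(UNBOUNDED, CANDIDATES) if t not in SIZES]
bounded_extra = [t for t in accepted(BOUNDED, CANDIDATES) if t not in SIZES]

print("unbounded accepts extra:", unbounded_extra)
print("bounded accepts extra:", bounded_extra)
```

The unbounded form also accepts tokens such as XXS and XXXL, which may or may not matter depending on what else can appear at the end of a title.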