In a very long string, I'd like to remove everything that starts with SPAIN:
until we get at least 4 capital letters in a row ending with a :
(ie. INDIA:
or SOUTH AFRICA:
or NORTHERN-IRELAND:
).
My attempt where A2 is the very long string:
=REGEXREPLACE(A2,"SPAIN: .*([A-ZÀ-ÿ\-\' ]{4,}): ","$1: ")
Unfortunately the above doesn't work as it's erasing everything until the very last 4 capital letters in the string.
Any ideas?
Sheet example to play with here:
CodePudding user response:
I have added a new sheet ("Erik Help") to your sample spreadsheet. The following formula is in B2 of that sheet:
=ArrayFormula(IF(A2:A="",,REGEXREPLACE(REGEXREPLACE(A2:A,IF(LEFT(A2:A,4)="ESPA","^. ("®EXEXTRACT(A2:A,"-:-([A-ZÀ-Ý]{4}[^:] )")&")","ESPAGNE[^~] -:-([A-ZÀ-Ý]{4}[^:] )"),"$1"),"ESPA. $","")))
This retrieves the correct result for all of the cases in your sample.